
Page 1: Migration to P4-Programmable Switches and Implementation of …1453759/FULLTEXT01.pdf · 2020. 7. 12. · Abstract P4 is a high-level language for programming the data plane of a

Linköpings universitet
SE–581 83 Linköping
+46 13 28 10 00, www.liu.se

Linköping University | Department of Computer and Information Science
Master’s thesis, 30 ECTS | Computer Science

2020 | LIU-IDA/LITH-EX-A--20/027--SE

Migration to P4-Programmable Switches and Implementation of the Rapid Spanning Tree Protocol
Övergång till P4-Programmerbara Switchar och Implementation av Rapid Spanning Tree Protocol

Henrik Lindström

Supervisor : Petru Eles
Examiner : Petru Eles

External supervisor : Mikael Johansson


Upphovsrätt

Detta dokument hålls tillgängligt på Internet - eller dess framtida ersättare - under 25 år från publiceringsdatum under förutsättning att inga extraordinära omständigheter uppstår.

Tillgång till dokumentet innebär tillstånd för var och en att läsa, ladda ner, skriva ut enstaka kopior för enskilt bruk och att använda det oförändrat för ickekommersiell forskning och för undervisning. Överföring av upphovsrätten vid en senare tidpunkt kan inte upphäva detta tillstånd. All annan användning av dokumentet kräver upphovsmannens medgivande. För att garantera äktheten, säkerheten och tillgängligheten finns lösningar av teknisk och administrativ art.

Upphovsmannens ideella rätt innefattar rätt att bli nämnd som upphovsman i den omfattning som god sed kräver vid användning av dokumentet på ovan beskrivna sätt samt skydd mot att dokumentet ändras eller presenteras i sådan form eller i sådant sammanhang som är kränkande för upphovsmannens litterära eller konstnärliga anseende eller egenart.

För ytterligare information om Linköping University Electronic Press se förlagets hemsida http://www.ep.liu.se/.

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a period of 25 years starting from the date of publication barring exceptional circumstances.

The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© Henrik Lindström


Abstract

P4 is a high-level language for programming the data plane of a network switch. These P4-programmable switches come with no pre-defined behavior or protocols, so it is entirely up to the loaded P4 program to define these. This allows the user to exclude any unwanted functionality and to create custom protocols. It also removes the dependence on the switch vendor in terms of both trust and addition of new features.

This thesis looks at migration from traditional switches to P4-programmable ones. Since no behavior is included out-of-the-box in the P4 switches, a search is made for open-source P4 projects and the functionality they provide is evaluated. It is found that most link layer functionality can be achieved with them, with the exception being loop prevention by spanning tree protocols. Therefore, one of the projects is extended with an implementation of the Rapid Spanning Tree Protocol based on the IEEE 802.1D-2004 standard. Finally, partial migration of networks to P4 switches and to the Software Defined Networking (SDN) paradigm is studied based on a literature review. Four general approaches and specific architectures for these are found, and it is concluded that such a hybrid network can still benefit from P4 and having a centralized SDN controller.


Acknowledgments

I would first like to thank my examiner Petru Eles at Linköping University. Petru was both my supervisor and examiner, and he provided me with good feedback as well as answers to any questions I had during the project. I also want to thank my external supervisor Mikael Johansson as well as Mattias Waldo at Saab for all the help and for providing me with any resources needed. Finally, I want to thank my friend Rasmus Larsson for helping me through discussions and feedback during the whole project.


Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Tables

1 Introduction
    1.1 Motivation
    1.2 Aim
    1.3 Research Questions
    1.4 Delimitations

2 Background
    2.1 Software Defined Networking
    2.2 Traditional Ethernet Switches
    2.3 Rapid Spanning Tree Protocol
    2.4 The P4 Programming Language
    2.5 Behavioral Model V2
    2.6 Mininet Network Emulator
    2.7 Apache Thrift

3 Related Work
    3.1 DC.p4: Programming the Forwarding Plane of a Data-Center Switch
    3.2 ARP-P4: deep analysis of a hybrid SDN ARP-Path/P4Runtime switch
    3.3 Hybrid SDN Architectures
    3.4 Previous Studies of Hybrid SDN

4 Method
    4.1 Literature Review
    4.2 Setup of Development Environment
    4.3 Studying Open Source P4 Solutions
    4.4 Implementing a P4 Switch With RSTP
    4.5 Testing the P4 Switch Implementation
    4.6 Studying Approaches for Partial Migration

5 P4 Switch Implementation
    5.1 Features
    5.2 Source Code
    5.3 Architecture Overview
    5.4 Switch.p4 Project
    5.5 Python Controller With RSTP
    5.6 Management CLI
    5.7 Test Environment

6 Results
    6.1 Link Layer Functionality
    6.2 Open Source P4 Solutions
    6.3 P4 Switch Implementation
    6.4 Approaches for Migration to P4 and SDN

7 Discussion
    7.1 Open Source P4 Solutions
    7.2 P4 Switch Implementation
    7.3 Approaches for Migration to P4 and SDN
    7.4 Method
    7.5 The Work in a Wider Context

8 Conclusions
    8.1 Answers to the Research Questions
    8.2 Future Work

Bibliography

A Installation of Development Environment
    A.1 Installing Dependencies
    A.2 Installing PI
    A.3 Installing Behavioral Model V2 (BMv2)
    A.4 Installing the P4 Compiler (p4c)
    A.5 Installing the Older P4 Compiler (p4c-bm)
    A.6 Installing Mininet
    A.7 Installing Packet Test Framework (PTF)


List of Figures

2.1 A simple LAN with a loop.
2.2 Port roles and states in a stable topology of four switches.

4.1 Inspection of BPDU contents using Wireshark.
4.2 The physical network setup used for testing.

5.1 Overview of a network of P4 switches.
5.2 Overview of a single P4 switch with BMv2.
5.3 Dumping information about port 1 through the CLI.
5.4 Visualization of the nodes and links in the Mininet network.


List of Tables

2.1 Information carried in STP and RSTP BPDUs.
2.2 Bit flags in the Flags field of a BPDU.

6.1 Link layer functionality in the P4 tutorials.
6.2 Link layer functionality in the p4-learning repository.
6.3 Link layer functionality in the p4-researching repository.
6.4 Link layer functionality provided by the Switch.p4 project.
6.5 Link layer functionality provided by the Sai_bridge.p4 project.
6.6 Link layer functionality of SONiC.
6.7 Link layer functionality of ONOS with fabric.p4.


1 Introduction

This chapter starts by introducing the motivation behind this thesis project. This is followed by a presentation of the aim of the project, which is then made clearer through a list of explicit research questions that are answered by the thesis.

1.1 Motivation

The network market and technology for data centers are at the beginning of a transformation. Major cloud players are leading this evolution, which involves what is called Network Function Disaggregation (NFD) [41]. The idea behind NFD is to migrate from proprietary switches with closed hardware and software towards so-called "white box" switches consisting of decoupled, open components.

Open source software will be a vital component in these white box devices. Firstly, it allows for simple extension of functionality without depending on a single vendor. Secondly, it removes the need to trust the vendor that no malicious features like backdoors are hidden in the device.

A major new property allowed by this transformation is that the behavior of the network will be fully programmable and defined with software. This is called Software Defined Networking (SDN), which specifies that the control of the network is centralized into one or more controllers, as opposed to the traditional distributed approach to networking. These controllers are sometimes referred to as Network Operating Systems (NOS) [43], and controlling a network in this manner allows for more flexibility and less complexity compared to traditional decentralized networks.

The concepts of SDN can even be taken one step further by making the switching devices themselves fully programmable. This can be achieved using P4, which is short for Programming Protocol-independent Packet Processors. P4 is a high-level programming language which is used to define the behavior of the data plane in a P4-programmable switch [3]. SDN and P4 go hand in hand and allow for full programmability of the network. However, due to the flexibility of P4, it can also be used in the traditional manner.

P4-programmable switches allow the customer to quickly implement new behavior and protocols, even custom ones, without any dependence on the switch vendor. The language also makes it possible to only include the protocols that are actually wanted, thus reducing complexity and enabling easier verification. In addition to this added flexibility, existing P4-programmable switches on the market are fast, processing packets at speeds of 12.8 Tb/s¹.

1.2 Aim

The aim of this thesis is to investigate the possibilities for replacing traditional switches with P4-programmable counterparts at the company Saab².

There are several motivations for this migration. Security is of high importance for Saab and their customers, and therefore open components that can be verified are preferred over proprietary, closed components which require trusting the vendor. Another desirable aspect of P4 is the freedom to quickly implement new functionality without relying on the switch vendor to do so. Additionally, having the option to drop all unused functionality and protocols, as well as being able to show the P4 source code to customers, are both desirable from verification and security perspectives.

In order to perform such a migration, research needs to be done into existing open source solutions for P4-programmable switches. Specifically, this thesis aims to study what link layer features can be replicated by existing solutions, and what needs to be built from scratch. This includes both implementations of data planes in the P4 language, as well as control plane solutions that are compatible with P4.

The next step is to build the data and control plane software for a P4-programmable switch based on the most suitable open source solutions. This implementation should achieve some of the link layer functionalities that come out of the box in traditional switches. Specifically, the aim is to have working MAC learning and loop prevention in a way that is compatible with traditional switches. This is not meant to be a complete replacement for a traditional switch, but instead to provide a base that Saab can extend in the future.

An additional point of interest in this thesis is the possibility for partial migration to P4-programmable switches, which means only replacing a subset of the traditional switches in an existing network. The aim with this is to reduce the cost of the migration, and this thesis investigates ways to build such a hybrid network while keeping the benefits of P4 and SDN.

1.3 Research Questions

In this thesis the following research questions will be investigated:

1. What link layer functionality is relevant for switches, and how much of it can be replicated in P4-programmable switches using existing open source solutions?

2. How can the data and control planes for P4-programmable switches be built to achieve Ethernet forwarding, MAC learning, and loop prevention with the Rapid Spanning Tree Protocol?

3. What approaches exist for partial migration of a LAN from legacy switches to P4-programmable switches? Can such a mixed network still benefit from P4 and having a central SDN controller?

1.4 Delimitations

This thesis project has a general delimitation to only consider link layer functionality and protocols for Ethernet switches. As a result of this, the first research question only studies link layer functionality, and the implementation in the second question is a link layer switch.

The network setups used for testing the implementation will be local area networks (LANs) of a limited number of switches.

The study performed to answer the third research question is of a limited scale, and does not include implementing the approaches or comparing them to each other.

¹ https://www.barefootnetworks.com/products/brief-tofino-2/
² https://saab.com/

2 Background

This chapter presents the background theory that this thesis is based on.

2.1 Software Defined Networking

Software Defined Networking (SDN) is a paradigm for networking that separates the control plane and the data plane of network devices. The control plane is the logic that makes decisions about network traffic, while the data plane refers to the forwarding of packets based on those decisions. Traditionally, these planes have been vertically integrated, which means that they both reside on the same network device. [32]

With the separation of the control plane and data plane, network devices can be simplified to only focus on packet forwarding, while the control logic can be moved out of the devices and become logically centralized as opposed to the traditional distributed approach. This centralized control can be handled by a so-called SDN Controller, also known as a Network Operating System. While SDN controllers enable logically centralized control of the network, that does not force a physically centralized implementation. In fact, both centralized and distributed controllers exist. The motivation behind the distributed approach comes from the need for better scalability and fault tolerance. [32]

The term SDN is often used in conjunction with the popular OpenFlow API, which is supported by many modern network switches. However, while the term was initially used for describing OpenFlow, the ideas behind SDN predate it by a long time, and OpenFlow is simply one realization of SDN. [21]

This fact has relevance in this thesis project, as P4-programmable switches generally favour other APIs such as P4Runtime [23] for implementing SDN. It should be noted, however, that P4 is flexible enough for OpenFlow support to be implemented on top of it¹.

¹ https://github.com/p4lang/switch/blob/master/p4src/openflow.p4


2.2 Traditional Ethernet Switches

The basic operation of an Ethernet switch is defined in the IEEE 802.1D standard [17] covering Media Access Control (MAC) bridges. Switches operate at the link layer (layer 2 of the OSI model) and play an important role in connecting hosts in local area networks. The main functions performed by switches include: packet forwarding, packet filtering, error checking, and prevention of loops. [26]

2.2.1 Basic Forwarding and Filtering

Switches forward packets based on MAC addresses which are present in the Ethernet header of a packet. This header contains both a source and a destination MAC address, each 48 bits long. MAC addresses are unique and are associated with specific hosts in a network, with the exception of multicast and broadcast addresses which refer to multiple hosts. [13]

The forwarding operation is based on a table in the switch that maps MAC addresses to switch ports. When the switch receives a packet, it performs a lookup on the destination address in this table to determine which port to send the packet to. If a match is found, the packet is forwarded to the port in question. However, if this port turns out to be the same port the packet was received on, the switch performs filtering by discarding the packet. Finally, if the table does not contain the destination MAC, the packet is forwarded to all ports except the one it was received on. This is called flooding. [33]
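The three outcomes of the lookup (forward, filter, flood) can be illustrated with a minimal Python sketch. The function name `forward`, the dictionary-based table, and the MAC addresses and port numbers are hypothetical and chosen only for illustration; they are not taken from any switch implementation discussed in this thesis.

```python
from typing import Dict, List


def forward(mac_table: Dict[str, int], ports: List[int],
            in_port: int, dst_mac: str) -> List[int]:
    """Return the ports a frame should be sent out on (illustrative only)."""
    out_port = mac_table.get(dst_mac)
    if out_port is None:
        # Unknown destination: flood on every port except the ingress port.
        return [p for p in ports if p != in_port]
    if out_port == in_port:
        # Destination is reached via the ingress port: filter (discard).
        return []
    # Known destination on another port: forward to that port only.
    return [out_port]


# Hypothetical table contents and port numbers.
table = {"00:00:00:00:00:0a": 1, "00:00:00:00:00:0b": 2}
ports = [1, 2, 3]
print(forward(table, ports, 1, "00:00:00:00:00:0b"))  # forwarded
print(forward(table, ports, 2, "00:00:00:00:00:0b"))  # filtered
print(forward(table, ports, 1, "00:00:00:00:00:0c"))  # flooded
```

Returning a list of output ports keeps the three cases uniform: an empty list means the packet is discarded.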

2.2.2 Learning

The table used in the forwarding and filtering process does not have to be configured manually, because switches have self-learning capabilities. Learning rests on the basic assumption that if a packet is received on a port, then that port can be used to reach the host with the source MAC address of the packet. This assumption is used to update the table every time a packet is received by the switch. Since the network topology can change at any time, the table also keeps track of the time each entry was last updated. Once an entry becomes old enough, it is removed from the table. This is called aging. [33]
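Learning and aging can be sketched as a small table class in Python. The class name `MacTable`, its methods, and the explicit `now` parameter are assumptions made for this illustration. The default aging time of 300 seconds is likewise an assumption that should be checked against the standard before being relied upon.

```python
import time
from typing import Dict, Optional, Tuple


class MacTable:
    """Illustrative MAC table with learning and aging (not production code)."""

    def __init__(self, max_age: float = 300.0) -> None:
        self.max_age = max_age  # seconds; assumed default
        self._entries: Dict[str, Tuple[int, float]] = {}

    def learn(self, src_mac: str, in_port: int,
              now: Optional[float] = None) -> None:
        # A packet from src_mac arrived on in_port, so that host is assumed
        # to be reachable through in_port. Refresh the timestamp as well.
        now = time.monotonic() if now is None else now
        self._entries[src_mac] = (in_port, now)

    def lookup(self, mac: str, now: Optional[float] = None) -> Optional[int]:
        now = time.monotonic() if now is None else now
        entry = self._entries.get(mac)
        if entry is None:
            return None
        port, updated = entry
        if now - updated > self.max_age:
            # The entry is too old: age it out and report it as unknown.
            del self._entries[mac]
            return None
        return port


# Usage with explicit timestamps so that the behavior is deterministic.
mac_table = MacTable(max_age=10.0)
mac_table.learn("00:00:00:00:00:0a", in_port=1, now=0.0)
print(mac_table.lookup("00:00:00:00:00:0a", now=5.0))   # still valid
print(mac_table.lookup("00:00:00:00:00:0a", now=11.0))  # aged out
```

In this sketch, entries are aged lazily at lookup time. A real switch would more likely remove old entries periodically, but the observable result for a lookup is the same.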

2.2.3 Broadcasting

A special MAC address which does not refer to any specific host is the broadcast address. This address is represented by a destination MAC of ff:ff:ff:ff:ff:ff, and it means that the packet is intended to reach all hosts in the LAN. To handle this, a switch receiving a broadcast packet should forward it on all ports, except for the one that received the packet. [26]

2.2.4 Multicasting

A second category of special MAC addresses are multicast addresses. A multicast address belongs to multiple hosts, and it is up to the configuration on each host to decide if a multicast packet should be accepted or not. Multicast addresses can be identified by the least significant bit of the first octet of the destination MAC being set to 1. Note that this is also the case for the broadcast address, and collectively these are known as group addresses. [13]

For a switch, packets with a multicast address as destination can be treated in the same way as broadcast packets, namely by forwarding them on all ports except the receiving one. However, by listening to Internet Group Management Protocol (IGMP) traffic, the switch can determine which multicast groups each port is part of. This is called IGMP snooping, and it allows the switch to only forward multicast traffic on the ports that need it. [26]
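As a brief illustration of how group addresses can be recognized, the following Python sketch tests the group bit of a destination MAC. It assumes the colon-separated textual notation used in this chapter and that the group bit is the least significant bit of the first octet. The helper names are hypothetical.

```python
def parse_mac(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' notation into six raw bytes."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address must consist of six octets")
    return octets


def is_group_address(mac: str) -> bool:
    # Group (multicast or broadcast) addresses have the least significant
    # bit of the first octet set to 1.
    return bool(parse_mac(mac)[0] & 0x01)


def is_broadcast(mac: str) -> bool:
    return parse_mac(mac) == b"\xff" * 6


print(is_group_address("ff:ff:ff:ff:ff:ff"))  # broadcast is a group address
print(is_group_address("01:80:C2:00:00:00"))  # multicast address
print(is_group_address("00:11:22:33:44:55"))  # ordinary unicast address
```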


2.2.5 Redundancy and Loop Prevention

Having redundant links in a LAN can be desirable for fault tolerance, but it comes with a problem: loops in the network. Consider the network in Figure 2.1, and imagine that one of the hosts sends a broadcast packet. This packet will be broadcast by the first switch to both the others, which in turn will continue to broadcast it around the loop, and so on. The result is that the packet continues to loop around the three switches forever, and in both directions. As can be imagined, having more than three switches would make this problem much worse and would likely result in duplication of packets. This problem is known as a broadcast storm [12].

Figure 2.1: A simple LAN with a loop.

One solution to dealing with redundancy and loops in the network is to use one of the spanning tree protocols. These protocols work in a distributed fashion, so every switch in the network runs the exact same algorithm. As the name suggests, the way they prevent loops is by creating a spanning tree in the network, and by blocking traffic on any link not part of this tree. This works since spanning trees have the property of having exactly one path between any two nodes in the tree.

The original version of this protocol is called the Spanning Tree Protocol (STP), and is standardized in IEEE 802.1D-1998 [18]. Since then it has been further developed into other versions such as the Rapid Spanning Tree Protocol (RSTP), and the Multiple Spanning Tree Protocol (MSTP). Since the focus of this thesis is on implementing RSTP, the main details about this protocol will be described later in section 2.3.

2.2.6 Virtual LANs

Switches supporting virtual local area networks (VLANs) are standardized in IEEE 802.1Q [14], and allow for division of a single LAN into multiple logical ones. This means that all switch ports in the LAN are grouped into different VLANs, each simulating a physical LAN consisting of only those ports. In order to achieve this, switches make sure to only forward packets between two ports if they are within the same VLAN. VLANs are completely transparent for the end hosts of the network since the result operates just like a physical LAN would. [33]

A main benefit of using VLANs is traffic isolation. Since packets can be broadcast, or flooded if the destination MAC is unknown to a switch, a large LAN will contain a lot of such traffic. Thus, VLANs can improve network performance by dividing the LAN into multiple ones. Another benefit is security, since end hosts are limited to only receive traffic within their VLAN, as opposed to any traffic in the whole physical LAN. [33]


VLAN Trunking

Usually it is desirable to spread a VLAN across multiple switches. A naive approach for this would be to use multiple cables when connecting two switches, one for each VLAN they share. However, this does not scale well and will cause a lot of switch ports to be wasted for this purpose. [33]

A solution to this is called VLAN trunking, which requires only one connection between the two switches. This connection can carry traffic of any VLAN, and to achieve this IEEE 802.1Q defines a tagged Ethernet format containing a VLAN tag, which includes a VLAN identifier. Tagged frames are only sent between switches, and only between two ports configured to be trunk ports. [33]

The tagged Ethernet format defined by IEEE 802.1Q differs from standard Ethernet by adding a four-byte tag. The main content of the tag is, as mentioned, the VLAN identifier, but it also contains a three-bit field with priority information. The VLAN identifier is 12 bits long, where 1–4094 are used as actual identifiers, while 0 denotes a lack of identifier and 4095 is reserved for implementation use. [14]
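The layout of the four-byte tag can be made concrete with a short Python sketch that packs and unpacks it. The text above only mentions the priority field and the VLAN identifier. The 16-bit tag protocol identifier 0x8100 and the single bit between the priority and the identifier (called `dei` here) are filled in from the general 802.1Q frame format and should be read as assumptions of this sketch. The function and type names are hypothetical.

```python
import struct
from typing import NamedTuple

TPID_8021Q = 0x8100  # tag protocol identifier assumed for 802.1Q tags


class VlanTag(NamedTuple):
    priority: int  # 3 bits
    dei: int       # 1 bit, not discussed in the text above
    vlan_id: int   # 12 bits


def build_vlan_tag(priority: int, vlan_id: int, dei: int = 0) -> bytes:
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    if dei not in (0, 1):
        raise ValueError("dei must be 0 or 1")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN identifier must fit in 12 bits")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_vlan_tag(tag: bytes) -> VlanTag:
    if len(tag) != 4:
        raise ValueError("a VLAN tag is exactly four bytes long")
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return VlanTag(priority=tci >> 13, dei=(tci >> 12) & 0x1,
                   vlan_id=tci & 0x0FFF)


tag = build_vlan_tag(priority=5, vlan_id=100)
print(tag.hex())
print(parse_vlan_tag(tag))
```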

2.2.7 Link Aggregation

Link aggregation (LAG) is a feature that is standardized in IEEE 802.1AX [15]. The purpose of the feature is to allow using multiple links between two devices as if they were a single link. Doing this can achieve a higher capacity between the devices, since traffic is split between the multiple links. Additionally, it introduces redundancy: if one of the links fails, the remaining ones will ensure that the aggregated link stays up. [42]

Link aggregation works by splitting traffic onto the links based on conversations. Two packets belong to the same conversation if their relative ordering is significant, meaning that they must be sent on the same link to preserve it. For example, all packets sent from host A to host B must keep their ordering intact, but their order relative to other unrelated traffic in the network has no significance. [42]
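One way to keep a conversation on a single link is to derive the link from a hash of the fields that identify the conversation. The Python sketch below does this for conversations identified by a source and destination MAC pair. Both the choice of CRC32 as the hash and the definition of a conversation as a MAC pair are assumptions made for illustration, and the function name and link names are hypothetical.

```python
import zlib
from typing import Sequence


def select_link(src_mac: str, dst_mac: str, links: Sequence[str]) -> str:
    """Pick one member link for the conversation src_mac -> dst_mac."""
    if not links:
        raise ValueError("the aggregated link has no operational members")
    # The same MAC pair always yields the same hash value, so every packet
    # of the conversation uses the same link and ordering is preserved.
    key = f"{src_mac.lower()}>{dst_mac.lower()}".encode("ascii")
    return links[zlib.crc32(key) % len(links)]


links = ["eth1", "eth2", "eth3"]  # hypothetical member links
a, b = "00:00:00:00:00:0a", "00:00:00:00:00:0b"
print(select_link(a, b, links))
print(select_link(a, b, links))  # same conversation, same link

# If a member link fails, conversations are spread over the remaining ones.
print(select_link(a, b, ["eth1", "eth3"]))
```

Because the hash depends only on the conversation, unrelated conversations may be spread over different links while each one keeps its internal ordering.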

The IEEE 802.1AX standard also defines a protocol called Link Aggregation Control Protocol (LACP). The purpose of the protocol is to enable automatic configuration of link aggregation. It works by having the switch transmit state information on its links using the protocol. This allows it and its neighbors to determine which links to aggregate. [15]

2.2.8 Discovery Protocols

The purpose of discovery is for network devices to learn about the existence and capabilities of their neighbors in a LAN, and there exist multiple protocols for this².

One such protocol is the Link Layer Discovery Protocol (LLDP), which is standardized in IEEE 802.1AB [16]. This protocol works by having switches periodically send LLDP Data Units (LLDPDU) on each interface. These data units consist of an arbitrary number of type-length-value (TLV) fields, each of a variable length. These fields contain various information about the switch and there are three mandatory ones: Chassis ID, Port ID and Time To Live. There are also some standardized optional TLVs, for example System Capabilities and Port Description. [16]
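The TLV structure lends itself to a simple parsing loop, sketched below in Python. The sketch assumes the LLDP encoding in which each TLV starts with a two-byte header holding a 7-bit type followed by a 9-bit length, with types 1, 2 and 3 for Chassis ID, Port ID and Time To Live, and type 0 marking the end of the data unit. These details are not given in the text above and should be verified against IEEE 802.1AB. The helper names and payload bytes are placeholders.

```python
import struct
from typing import Iterator, Tuple


def build_tlv(tlv_type: int, value: bytes) -> bytes:
    if not 0 <= tlv_type <= 127:
        raise ValueError("TLV type must fit in 7 bits")
    if len(value) > 511:
        raise ValueError("TLV value must fit in a 9-bit length")
    return struct.pack("!H", (tlv_type << 9) | len(value)) + value


def parse_tlvs(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, value) pairs until the end marker or end of data."""
    offset = 0
    while offset + 2 <= len(data):
        (header,) = struct.unpack_from("!H", data, offset)
        tlv_type, length = header >> 9, header & 0x01FF
        offset += 2
        if tlv_type == 0:  # assumed "End of LLDPDU" marker
            return
        value = data[offset:offset + length]
        if len(value) < length:
            raise ValueError("truncated TLV")
        yield tlv_type, value
        offset += length


# Placeholder payloads; real TLVs carry structured values.
lldpdu = (build_tlv(1, b"chassis") + build_tlv(2, b"port-1")
          + build_tlv(3, struct.pack("!H", 120)) + build_tlv(0, b""))
for tlv_type, value in parse_tlvs(lldpdu):
    print(tlv_type, value)
```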

² https://www.cisco.com/en/US/technologies/tk652/tk701/technologies_white_paper0900aecd804cd46d.html


2.3 Rapid Spanning Tree Protocol

This section contains an overview of the Rapid Spanning Tree Protocol (RSTP) as defined in the IEEE 802.1D-2004 standard [17].

The protocol was first introduced in IEEE 802.1w-2001, and was later incorporated in 802.1D-2004, replacing the original STP in the standard. However, RSTP still maintains backwards compatibility with STP by adopting the old behavior when detecting a neighboring STP switch.

The main improvement introduced with RSTP is faster reconfiguration of the spanning tree when changes occur in the network. These changes could for example be the removal, failure, or addition of a link or switch to the network.

As a simplified introduction to the behavior of the protocol, the following points describe the main idea behind how the switches together create a spanning tree throughout the network:

1. One of the switches needs to be elected as the root of the tree.

2. On each switch other than the root, there needs to be one port which is the root port of that switch. This is the port that has the lowest cost path to the root switch.

3. For each link, there needs to be one switch port which is the designated port to that link. This is the port that provides the lowest cost path from this link to the root switch.

4. Assuming the previous three points are fulfilled, and once the algorithm stabilizes, the spanning tree will consist of all root ports along with all designated ports in the network. In practice, this is achieved by having each switch block traffic on any port that is not a root port or a designated port.

The root and designated ports together create a spanning tree because they ensure that each switch, and each link, only has one port towards the root of the tree. In addition, this path will be the lowest cost path available.

The protocol is designed to be plug and play, meaning that it is the responsibility of the protocol to ensure that all switches agree on who is the root, as well as which ports should be root ports and designated ports. However, the 802.1D-2004 standard also defines configuration parameters, which can be updated during deployment, that allow a network administrator to shape the spanning tree as desired.

The following sections will go into more detail about the different parts of the protocol and how it functions.

2.3.1 Bridge Protocol Data Units

In RSTP, each switch only has direct communication with its neighboring switches, and this is achieved through sending so-called Bridge Protocol Data Units (BPDUs).

BPDUs are sent to the special multicast MAC address 01:80:C2:00:00:00, which is an address that is not forwarded by switches. These packets are sent at regular intervals on designated ports once every “Hello Time”, which defaults to 2 seconds. They are also sent instantly if any new information is available, but with a limit on the maximum rate.

Both STP and RSTP use the same structure for BPDUs, and since RSTP is backwards compatible it needs to be able to handle STP BPDUs as well. As a result of this, there are three types of BPDUs. STP defines both the Configuration and Topology Change Notification (TCN) BPDUs, while RSTP only has one kind called an RST BPDU. The purpose of the TCN messages is to alert other switches when a change to the network topology is detected. This lets the switches forget learned MAC addresses which might have become incorrect. In RST BPDUs, topology change notification is a flag instead of a separate type.


Table 2.1 shows what information is included in STP and RSTP BPDUs. Note that for a TCN message, only the fields up to and including the “BPDU Type” are used.

Table 2.1: Information carried in STP and RSTP BPDUs.

Field                        Bytes  Description
Protocol Identifier          2      Both STP and RSTP set this field to 0
Protocol Version Identifier  1      Identifies if the protocol is STP or RSTP
BPDU Type                    1      Configuration, TCN, or RST BPDU
Flags                        1      See Table 2.2
Root Identifier              8      Identifier of the switch the sender believes to be root
Root Path Cost               4      The sender’s cost to the root switch
Bridge Identifier            8      Identifier of the sender switch
Port Identifier              2      Identifier of the port the BPDU was sent from
Message Age                  2      The age of the information carried by this BPDU
Max Age                      2      The age where the information should be discarded
Hello Time                   2      Interval used for the hello timer
Forward Delay                2      Interval used for the forward delay timer
Version 1 Length             1      Allows for protocol extension, not present in STP
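The field layout in Table 2.1 maps directly onto a fixed-size binary record. The Python sketch below packs and unpacks such a record with the standard struct module, assuming big-endian fields without padding. The class and function names are hypothetical, and the example values (version 2 and type 0x02 for an RST BPDU, timers expressed in units of 1/256 seconds) are assumptions about the 802.1D-2004 encoding rather than something stated in this chapter.

```python
import struct
from typing import NamedTuple

# One format character per row of Table 2.1: 2+1+1+1+8+4+8+2+2+2+2+2+1 = 36 bytes.
RST_BPDU_FORMAT = ">HBBBQIQHHHHHB"


class Bpdu(NamedTuple):
    protocol_id: int
    version: int
    bpdu_type: int
    flags: int
    root_id: int
    root_path_cost: int
    bridge_id: int
    port_id: int
    message_age: int
    max_age: int
    hello_time: int
    forward_delay: int
    version1_length: int


def pack_bpdu(bpdu: Bpdu) -> bytes:
    """Serialize all fields in table order."""
    return struct.pack(RST_BPDU_FORMAT, *bpdu)


def unpack_bpdu(data: bytes) -> Bpdu:
    """Parse the first 36 bytes of a BPDU payload."""
    size = struct.calcsize(RST_BPDU_FORMAT)
    return Bpdu(*struct.unpack(RST_BPDU_FORMAT, data[:size]))


# Hypothetical BPDU sent by a switch that believes itself to be the root.
own_id = (0x8000 << 48) | 0x0123456789AB
example = Bpdu(0, 2, 0x02, 0x3C, own_id, 0, own_id, 0x8001,
               0, 20 * 256, 2 * 256, 15 * 256, 0)
wire = pack_bpdu(example)
```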

The Flags field in the BPDU has a size of 1 byte and carries the bit flags shown in Table 2.2. Note that STP BPDUs only recognize the “Topology Change” and “Topology Change Acknowledgment” flags.

Table 2.2: Bit flags in the Flags field of a BPDU.

Flag                            Bits
Topology Change                 1
Proposal                        1
Port Role                       2
Learning                        1
Forwarding                      1
Agreement                       1
Topology Change Acknowledgment  1
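A small Python sketch can make the bit layout concrete. It assumes that the rows of Table 2.2 are listed from the least significant bit upwards, and that the two-bit Port Role field uses the values 0 to 3 for unknown, alternate or backup, root and designated. Both are assumptions about the 802.1D-2004 encoding, and the function name is hypothetical.

```python
# Assumed encoding of the two-bit Port Role field.
PORT_ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}


def decode_flags(flags: int) -> dict:
    """Split the one-byte Flags field into its individual flags."""
    return {
        "topology_change": bool(flags & 0x01),
        "proposal": bool(flags & 0x02),
        "port_role": PORT_ROLES[(flags >> 2) & 0x03],
        "learning": bool(flags & 0x10),
        "forwarding": bool(flags & 0x20),
        "agreement": bool(flags & 0x40),
        "topology_change_ack": bool(flags & 0x80),
    }


# 0x3C = 0b00111100: a designated port that is learning and forwarding.
stable_designated = decode_flags(0x3C)
# 0x0E = 0b00001110: a designated port sending a proposal.
proposing = decode_flags(0x0E)
```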

The purpose of these flags is to signal topology changes, and in RSTP they also allow for negotiation between switches. This allows RSTP to put ports into the forwarding state faster, compared to STP which is forced to wait for timers to expire first.

2.3.2 Agreeing on a Root Switch

For the protocol to function at all, there needs to be a mechanism for all the switches to agree on which should be the root of the tree. This is solved by requiring each switch to have a unique identifier number. This number is created by combining two parts, a priority and the MAC address of the switch. The priority part occupies the more significant bits of the identifier, and is a configuration parameter. The MAC address is included to ensure uniqueness.

A general trend throughout the protocol is that lower numbers are considered “better” in comparisons. This is also the case for switch identifiers, and as such, the switch with the numerically lowest identifier will be accepted as the root of the tree.
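The comparison can be sketched in a few lines of Python. The sketch assumes an 8-byte identifier, as listed in Table 2.1, made up of a 16-bit priority above a 48-bit MAC address. The helper name, the MAC addresses and the priority values are hypothetical.

```python
def bridge_id(priority: int, mac: str) -> int:
    """Combine a 16-bit priority and a 48-bit MAC address into one identifier."""
    mac_value = int(mac.replace(":", ""), 16)
    return (priority << 48) | mac_value


# Hypothetical switches: C has been given a lower (better) priority by management.
switches = {
    "A": bridge_id(32768, "00:00:00:00:00:0a"),
    "B": bridge_id(32768, "00:00:00:00:00:0b"),
    "C": bridge_id(4096, "00:00:00:00:00:0c"),
}

# The numerically lowest identifier wins the election.
root = min(switches, key=switches.get)

# With equal priorities, the MAC address breaks the tie.
root_without_c = min(("A", "B"), key=switches.get)
```

Since the priority sits in the most significant bits, an administrator can force a particular switch to become root regardless of its MAC address.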

2.3.3 Port States

There are three states that each switch port can be in when using RSTP, and these are: Forwarding, Learning and Discarding.


In the Forwarding state, packets are forwarded and MAC learning is enabled. In the Learning state, learning is enabled, but no packets are forwarded. Finally, in the Discarding state, the port neither forwards nor learns.

The main goal of the protocol is to decide which state each port should be in to maintain the spanning tree.

2.3.4 Port Roles

Each port of a switch running RSTP also has a port role. There are five available roles and these are the following: Designated Port, Root Port, Alternate Port, Backup Port and Disabled Port.

• A Disabled Port is a port that is not in an operational state and cannot forward any packets. This could be due to a failure or by management action.

• The Designated Port role is assigned to ports that provide the best root path cost to the link they are attached to. There can be multiple designated ports on a single switch, but each link should only have one. The switch that has the designated port for a certain link is referred to as the Designated Bridge for that link.

• The role of Root Port on a switch is given to the port that has the lowest root path cost. Each switch has just one root port, except for the root switch which has none.

• If a switch is the designated bridge for a link, and has multiple ports connected to it, then each of these ports except the designated one will have the Backup Port role. As the name suggests, this port acts as a backup for the designated port.

• Finally, any other port gets the Alternate Port role. These ports act as backups for the root ports of the switches.

As mentioned previously, the port roles and port states have a close relation. The goal is that every root port and designated port should be put in the forwarding state, while any other port should be in the discarding state. However, changing ports to forwarding too fast will result in temporary loops. Therefore, in some cases it is necessary to first wait for timers to expire or for a negotiation process between neighboring switches to complete.
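The relation between roles and states can be summarized in a short Python sketch. The enum and function names are hypothetical. The sketch only captures the state a port should eventually reach, not the timers or negotiation that govern the transition.

```python
from enum import Enum


class Role(Enum):
    ROOT = "root"
    DESIGNATED = "designated"
    ALTERNATE = "alternate"
    BACKUP = "backup"
    DISABLED = "disabled"


class State(Enum):
    DISCARDING = "discarding"
    LEARNING = "learning"
    FORWARDING = "forwarding"


def target_state(role: Role) -> State:
    """State a port should end up in once the topology has stabilized."""
    if role in (Role.ROOT, Role.DESIGNATED):
        return State.FORWARDING
    return State.DISCARDING


def learns(state: State) -> bool:
    """MAC learning is enabled in both the Learning and Forwarding states."""
    return state in (State.LEARNING, State.FORWARDING)


def forwards(state: State) -> bool:
    """Packets are only forwarded in the Forwarding state."""
    return state is State.FORWARDING
```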

2.3.5 Updating Port Roles

As soon as new information is received by a switch, either from a received BPDU or by management action, all its port roles are updated.

The update process starts by deciding which switch is believed to be the root of the network, and this simply becomes the switch with the lowest known identifier. This in turn allows the switch to select its best root port, unless it believes itself to be the root of course. The roles for the other ports are then selected as follows.

If the information has aged out on a port, i.e. no BPDUs have been received for a while, or if the switch provides a better path to a link than what received BPDUs are advertising, then this port will become a designated port.

However, if the BPDUs received on the port instead advertise a better cost to the link than what the switch itself provides, then it will become an alternate or backup port. Which of the two depends on if it is the switch itself that provides the better path or another switch. If it is itself, then it will become a backup port, otherwise an alternate port.
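The selection rules above can be sketched in Python as follows. This is a simplified illustration and not the state machines of the standard: it assumes that the information from each BPDU is reduced to a tuple of root identifier, root path cost, sender identifier and sender port, compared field by field with lower being better. Aged-out information is modelled as None, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True, order=True)
class Vector:
    """Information advertised in a BPDU, compared field by field (lower is better)."""
    root_id: int
    root_path_cost: int
    bridge_id: int
    port_id: int


def assign_roles(own_id: int, port_costs: Dict[int, int],
                 received: Dict[int, Optional[Vector]]) -> Dict[int, str]:
    """Select a role for every port from the latest information received on it."""
    # Candidate root paths: the switch itself as root, or a path through a neighbor.
    candidates = [(Vector(own_id, 0, own_id, 0), None)]
    for port, vec in received.items():
        if vec is not None and vec.bridge_id != own_id:
            via = Vector(vec.root_id, vec.root_path_cost + port_costs[port],
                         vec.bridge_id, vec.port_id)
            candidates.append((via, port))
    best, root_port = min(candidates, key=lambda c: c[0])

    roles = {}
    for port, vec in received.items():
        # What this switch would advertise on the port if it were designated.
        offered = Vector(best.root_id, best.root_path_cost, own_id, port)
        if port == root_port:
            roles[port] = "root"
        elif vec is None or offered < vec:
            roles[port] = "designated"
        elif vec.bridge_id == own_id:
            roles[port] = "backup"
        else:
            roles[port] = "alternate"
    return roles


# Switch 4 hears about root 1 through switches 2 and 3, both at cost 10.
roles_d = assign_roles(4, {1: 10, 2: 10},
                       {1: Vector(1, 10, 2, 3), 2: Vector(1, 10, 3, 2)})
# Switch 2 hears the root on port 1, nothing on port 2, and itself on port 3.
roles_b = assign_roles(2, {1: 10, 2: 10, 3: 10},
                       {1: Vector(1, 0, 1, 1), 2: None, 3: Vector(1, 10, 2, 2)})
```

In the first call, port 1 becomes the root port and port 2 an alternate port. In the second call, port 3 becomes a backup port because the better information on that link comes from the switch itself.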


2.3.6 Example Topology

Figure 2.2 demonstrates the resulting port roles and states after the RSTP algorithm has stabilized in a small LAN. In the example, switch A has the lowest identifier, followed by B, C then D. Therefore, A becomes the root switch.

[Figure: four switches A, B, C and D. Each port is marked with its role (DP: Designated Port, RP: Root Port, BP: Backup Port, AP: Alternate Port) and its state (Forwarding or Discarding).]

Figure 2.2: Port roles and states in a stable topology of four switches.

2.3.7 STP Compatibility

RSTP has full backwards compatibility with the older STP version. This means that it can be configured to work in full STP mode by management. However, even in normal operation, RSTP will listen to any received STP BPDUs. If any such messages are seen, it will enter STP mode on that port, but leaving other ports unaffected. This means that it will stop sending RST BPDUs, and instead send Configuration and TCN BPDUs on that port. Additionally, it also mimics the full behavior of STP, losing any benefit of RSTP on the port in question.


2.4 The P4 Programming Language

P4, which is short for Programming Protocol-independent Packet Processors, is a domain specific language for programming the data plane of a network switch. The language was first presented by Bosshart et al. [3] in 2014. Two years later in 2016, a new version of the language was presented by Budiu and Dodd [4]. To differentiate the two versions, the new one is referred to as P4_16, while the original is P4_14.

The P4 language was developed with three main goals in mind: reconfigurability, protocol independence and target independence. Reconfigurability refers to the ability for a controller to update the forwarding behavior in the field. Protocol independence means that it is up to the programmer to define the format of packet headers, and how these are processed. Finally, target independence is simply the fact that a P4 program can be compiled for different switch targets, without needing any knowledge of the underlying hardware. [3]

The syntax of the language is based on the C programming language, but with different constructs suited for packet processing. These include header definitions, tables, parsers and controls, among others. However, the language also has some limitations compared to C, lacking support for pointers, memory allocation, floating point calculations and recursion. Additionally, the only sort of looping in the language is within parser blocks which work like a state machine. [4]

The following sections will focus on describing the P4_16 version of the language and there will be a running example in the form of code snippets.

2.4.1 Target Architectures

In P4_16, any target specific details are defined inside of a P4 file, which is provided with the compiler for the target architecture. These details include target specific externs and interfaces for the programmable blocks available by the architecture. [8]

An extern is a construct in P4_16 that defines an API for target specific functionality. This allows it to be used inside of programs and could, for example, be a checksum unit which is hard-wired in the switch architecture and thus not normally programmable using P4. [8]

Having different architectures in P4_16 might seem counterintuitive, considering one of the main goals of P4 was target independence. However, the P4.org Architecture Working Group have defined a specification for an architecture called the Portable Switch Architecture (PSA) which is intended to be general enough that many different targets can support it. [24]

Listing 2.1 shows how a P4_16 program can include the target specific file, in this case v1model.p4³ for the V1Model architecture. Note that P4_14 did not have different architectures, and instead defined a single abstract switch model. V1Model intends to be equivalent to this model, but for P4_16. [9]

The core.p4 file is the P4_16 core library, which defines some built-in constructs, and should be included by every P4_16 program. [8]

Listing 2.1: Including the core library and V1Model architecture definition in P4_16.

    #include <core.p4>
    #include <v1model.p4>

³ https://github.com/p4lang/p4c/blob/master/p4include/v1model.p4


2.4.2 Header Definitions

Since one of the goals with P4 was to allow for protocol independence, the language has a specific construct to define packet header formats. This is done using the header keyword, which lets the programmer define the structure of the headers that the switch needs to recognize. Doing this is needed to be able to parse that header in incoming packets.

Listing 2.2: Ethernet and IPv4 header definitions in P4_16.

    header Ethernet_h {
        bit<48> destAddr;
        bit<48> srcAddr;
        bit<16> etherType;
    }
    header Ipv4_h {
        bit<4> version;
        bit<4> internetHeaderLength;
        bit<8> diffServ;
        bit<16> totalLength;
        bit<16> identification;
        bit<3> flags;
        bit<13> fragmentOffset;
        bit<8> ttl;
        bit<8> protocol;
        bit<16> headerChecksum;
        bit<32> srcAddr;
        bit<32> destAddr;
    }

In the example shown in Listing 2.2, the format of the Ethernet and IPv4 headers are defined in P4_16. Header types describe the exact layout of a packet header, and can contain different base types. For example, bit<48> is a 48 bit long unsigned integer. [8]

In addition to the fields defined by the programmer, header types also have a hidden “valid” bit. This bit is not set by default, but is automatically set when the header is successfully read from a packet. This allows checking if a header is present in a packet, and also controls if the header should be emitted when forwarding the packet. [4]

2.4.3 Packet Parsing

In P4, it is the programmer that describes how the headers in the incoming packets should be parsed. This is done through the parser construct, which allows expressing a state machine for how headers are recognized and extracted. The result of the parsing process is that a set of headers are extracted from the packet and these automatically have their “valid” bit set, while the headers that were not extracted remain invalid. [4]

In P4_16, the input and output parameters of the parser are defined in the target architecture file, which for example is in v1model.p4 for the V1Model architecture. The parsing begins in the start state, and switching states is done using the transition keyword. The process finishes once either the accept or the reject state is reached. [8]

An example of a parser for the V1Model architecture is shown in Listing 2.3. In the example, the parser will start by extracting the Ethernet header, and will either accept, or move on to parsing IPv4 depending on the etherType field in the Ethernet header.


Listing 2.3: Parser for Ethernet and IPv4 in P4_16, targeting V1Model.

    struct Metadata {} // User-defined metadata.
    struct Headers {
        Ethernet_h ethernet;
        Ipv4_h ipv4;
    }
    parser ExampleParser(
        packet_in packet,
        out Headers headers,
        inout Metadata metadata,
        inout standard_metadata_t standardMetadata
    ) {
        state start {
            transition parseEthernet;
        }
        state parseEthernet {
            packet.extract(headers.ethernet);
            transition select(headers.ethernet.etherType) {
                0x0800: parseIpv4;
                default: accept;
            }
        }
        state parseIpv4 {
            packet.extract(headers.ipv4);
            transition accept;
        }
    }
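To make the state machine behavior of Listing 2.3 more tangible, the following Python sketch mimics the same three states on raw packet bytes. It is only an analogy for illustration and not how a P4 target executes a parser. The function name and the dictionary representation of headers are hypothetical, and moving to reject on a too-short packet is a simplifying assumption.

```python
import struct

ETHERNET_LENGTH = 14  # 48 + 48 + 16 bits
IPV4_LENGTH = 20      # Fixed part of the header, as defined in Listing 2.2.


def parse(packet: bytes) -> dict:
    """Walk the states of Listing 2.3; a header in the result counts as valid."""
    headers, offset, state = {}, 0, "start"
    while state not in ("accept", "reject"):
        if state == "start":
            state = "parseEthernet"
        elif state == "parseEthernet":
            if len(packet) < offset + ETHERNET_LENGTH:
                state = "reject"
                continue
            dest, src, ether_type = struct.unpack_from("!6s6sH", packet, offset)
            headers["ethernet"] = {"destAddr": dest, "srcAddr": src,
                                   "etherType": ether_type}
            offset += ETHERNET_LENGTH
            # Corresponds to the select statement on etherType.
            state = "parseIpv4" if ether_type == 0x0800 else "accept"
        elif state == "parseIpv4":
            if len(packet) < offset + IPV4_LENGTH:
                state = "reject"
                continue
            fields = struct.unpack_from("!BBHHHBBH4s4s", packet, offset)
            headers["ipv4"] = {"ttl": fields[5], "protocol": fields[6],
                               "srcAddr": fields[8], "destAddr": fields[9]}
            offset += IPV4_LENGTH
            state = "accept"
    return {"state": state, "headers": headers}


# Hypothetical test packets: an IPv4 frame and a frame with another etherType.
ipv4_header = bytes.fromhex("45000073000040004011b861c0a80001c0a800c7")
ipv4_frame = bytes(12) + b"\x08\x00" + ipv4_header
other_frame = bytes(12) + b"\x08\x06" + bytes(28)
ipv4_result = parse(ipv4_frame)
other_result = parse(other_frame)
```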

2.4.4 Control Blocks

Control blocks are defined with the control keyword, and are used to implement the programmable parts of a switch architecture in P4_16. Each control has an apply block which behaves as an imperative program that can contain various statements and invoke table lookups. [4]

Tables and Actions

Both actions and tables can be defined inside of a control block, and they are the main way for the control plane to change the behavior of the data plane. Tables consist of a key definition which describes how entries are looked up in the table, and a set of actions which the table is allowed to invoke. The actual entries of the table are installed by the control plane, including the key and which action should be taken, along with what parameters to pass the action. [4]

Keys can include different variables, such as fields in the packet headers or packet metadata. Each field of the key has a specific match type, and the P4 core library defines the following types: exact match, longest prefix match, and ternary match. [8]
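The semantics of the three match types can be illustrated with a short Python sketch operating on integer keys. It describes only the matching rules, not how a switch target implements them, and all function names and table entries are hypothetical.

```python
import ipaddress


def exact_match(key: int, value: int) -> bool:
    """Exact: every bit of the key must equal the entry."""
    return key == value


def ternary_match(key: int, value: int, mask: int) -> bool:
    """Ternary: only the bits selected by the mask are compared."""
    return (key & mask) == (value & mask)


def lpm_match(key: int, prefix: int, prefix_len: int, width: int = 32) -> bool:
    """Longest prefix match: compare the top prefix_len bits of the key."""
    mask = ((1 << prefix_len) - 1) << (width - prefix_len)
    return (key & mask) == (prefix & mask)


def lpm_lookup(key: int, entries: list):
    """Return the action of the matching entry with the longest prefix, if any."""
    best = None
    for prefix, prefix_len, action in entries:
        if lpm_match(key, prefix, prefix_len):
            if best is None or prefix_len > best[0]:
                best = (prefix_len, action)
    return best[1] if best else None


def ip(address: str) -> int:
    return int(ipaddress.IPv4Address(address))


# Hypothetical routing entries: (prefix, prefix length, action name).
routes = [
    (ip("10.0.0.0"), 8, "port1"),
    (ip("10.1.0.0"), 16, "port2"),
    (ip("0.0.0.0"), 0, "default"),
]
chosen = lpm_lookup(ip("10.1.2.3"), routes)
```

Here the key 10.1.2.3 matches all three entries, and the 16-bit prefix is chosen because it is the most specific one.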

Listing 2.4 shows an example of a control block intended for the ingress stage of the V1Model architecture, which comes after the parsing stage. In the apply block, the control will invoke the ipv4Exact table assuming the packet has an IPv4 header and that the time to live (ttl) field is sufficient. The ipv4Exact table uses the destination address of the IPv4 header to match against, and allows invoking either the drop or forwardIpv4 actions. The drop action simply drops the packet by calling an extern, while forwardIpv4 sets the outbound port, and updates the MAC addresses and the IPv4 ttl. As an example, this table would allow a control plane to install a table entry that makes the switch route packets destined to the IP 10.1.2.3 out on port 1 with a destination MAC address of 01:23:45:67:89:AB.

Listing 2.4: Example control for ingress processing in P4_16.

    control ExampleIngress(
        inout Headers headers,
        inout Metadata metadata,
        inout standard_metadata_t standard_metadata
    ) {
        action forwardIpv4(bit<48> destMacAddr, bit<9> port) {
            standard_metadata.egress_spec = port;
            headers.ethernet.srcAddr = headers.ethernet.destAddr;
            headers.ethernet.destAddr = destMacAddr;
            headers.ipv4.ttl = headers.ipv4.ttl - 1;
        }
        action drop() {
            // Extern in V1Model that drops the packet.
            mark_to_drop(standard_metadata);
        }
        table ipv4Exact {
            key = { headers.ipv4.destAddr: exact; }
            actions = {
                forwardIpv4;
                drop;
            }
            size = 512;
            default_action = drop();
        }
        apply {
            if (headers.ipv4.isValid() && headers.ipv4.ttl > 1) {
                ipv4Exact.apply();
            } else {
                drop();
            }
        }
    }

The V1Model pipeline has more programmable stages than just the ingress. The full pipeline, after the parsing stage, consists of the following programmable parts: VerifyChecksum, Ingress, Egress, ComputeChecksum and the Deparser. For example, the ComputeChecksum stage can be used to recalculate the IPv4 headerChecksum field, and the Deparser describes how packet headers are emitted from the switch.

Listing 2.5 shows how the ComputeChecksum stage can be implemented using the update_checksum extern of V1Model. This is needed if the switch modifies any IPv4 fields, such as ttl.

Another example is shown in Listing 2.6 of a deparser that emits Ethernet and IPv4 headers. Note that headers are only emitted if their “valid” bit is set.


Listing 2.5: Example checksum computation for V1Model in P4_16.

    control ExampleComputeChecksum(
        inout Headers headers,
        inout Metadata metadata
    ) {
        apply {
            update_checksum(
                headers.ipv4.isValid(), // Only if header valid.
                { // Calculate with these fields.
                    headers.ipv4.version,
                    headers.ipv4.internetHeaderLength,
                    headers.ipv4.diffServ,
                    headers.ipv4.totalLength,
                    headers.ipv4.identification,
                    headers.ipv4.flags,
                    headers.ipv4.fragmentOffset,
                    headers.ipv4.ttl,
                    headers.ipv4.protocol,
                    headers.ipv4.srcAddr,
                    headers.ipv4.destAddr
                },
                headers.ipv4.headerChecksum, // Checksum field.
                HashAlgorithm.csum16 // Algorithm to use.
            );
        }
    }
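For reference, the arithmetic behind the csum16 algorithm can be sketched in Python. The sketch assumes the usual IPv4 header checksum, that is, the ones' complement of the ones' complement sum of all 16-bit words with the checksum field counted as zero. The function name and the example header bytes are hypothetical, and the sketch does not describe how the update_checksum extern is implemented.

```python
def csum16(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # Pad to a whole number of 16-bit words.
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # Fold the carries back in.
    return ~total & 0xFFFF


# Hypothetical 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
header = bytearray.fromhex("450000730000400040110000c0a80001c0a800c7")
original_checksum = csum16(bytes(header))

# Decrementing ttl (byte 8), as forwardIpv4 does, changes the checksum.
header[8] -= 1
updated_checksum = csum16(bytes(header))
header[10:12] = updated_checksum.to_bytes(2, "big")

# A header carrying a correct checksum sums to zero.
verification = csum16(bytes(header))
```

Leaving headerChecksum out of the field list in Listing 2.5 has the same effect as treating it as zero in this sketch.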

Listing 2.6: Example deparser in P4_16.

    control ExampleDeparser(
        packet_out packet,
        in Headers headers
    ) {
        apply {
            packet.emit(headers.ethernet);
            packet.emit(headers.ipv4);
        }
    }

2.4.5 The Main Declaration

To combine the different parts into one program, P4_16 requires the programmer to write a main declaration. This tells the compiler which parser and controls should be used for the different programmable parts of the architecture. [4]

This means that the structure of this declaration will depend on what programmable stages have been defined in the architecture P4 file. An example for V1Model is shown in Listing 2.7. Note that SomeVerifyChecksum and SomeEgress are also controls, but have not been exemplified in any of the code listings. However, for the purpose of the running example they can just be empty controls.


Listing 2.7: Main declaration for V1Model in P4_16.

    V1Switch(
        ExampleParser(),
        SomeVerifyChecksum(),
        ExampleIngress(),
        SomeEgress(),
        ExampleComputeChecksum(),
        ExampleDeparser()
    ) main;

The resulting P4 program from these code listings (2.1–2.7) describes a data plane that acts as a simple router. It is only capable of very basic IPv4 forwarding, but it still showcases most base features of the language.

2.5 Behavioral Model V2

The Behavioral Model V2 (BMv2) is a framework for testing and debugging compiled P4 programs in software. BMv2 includes the simple_switch⁴ target, which is a software switch capable of running P4_14 programs as well as P4_16 programs targeting the V1Model architecture. This switch is only intended for development purposes, and using it as a software switch in production is not recommended⁵.

To compile P4 code for BMv2 there exist two open source compilers: p4c⁶ and p4c-bm⁷. The p4c compiler is the most modern of the two and can compile both P4_16 and P4_14 code, while p4c-bm only compiles P4_14.

The interface between a control plane and the simple_switch target is defined by a Thrift⁸ Remote Procedure Call (RPC) API. This allows the control plane to, for example, manipulate the tables of the running P4 program. There also exists another target in BMv2 called simple_switch_grpc which uses the more modern and standardized P4Runtime [23] API.

2.6 Mininet Network Emulator

Mininet⁹ is a Python platform that allows emulating networks on a single computer. This includes hosts, switches, controllers and links between these, and Mininet allows running any Linux programs on the different nodes. Mininet includes a Python API which allows defining custom network topologies as well as running the emulation and starting a Command Line Interface (CLI) for manipulating the network. [34]

While running the emulation, the network can be interacted with through the CLI. This includes being able to run commands on hosts, such as pings, as well as bringing switches and links up or down. As an example, to ping from host h1 to host h2, one would input h1 ping h2 into the CLI. Alternatively, a terminal could have been opened on h1 through xterm h1, to allow full access to a Linux shell on it. Similarly, to bring down the link between switch s1 and switch s2, the command link s1 s2 down would be used.

Mininet works through the use of network namespaces and virtual Ethernet pairs, both provided by the Linux kernel. Network namespaces are used for the hosts and switches in Mininet, giving them their own set of Ethernet interfaces and ARP/IP routing tables. The interfaces of these hosts and switches are connected together through the use of virtual Ethernet pairs. [34]

⁴ https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md
⁵ https://github.com/p4lang/behavioral-model
⁶ https://github.com/p4lang/p4c
⁷ https://github.com/p4lang/p4c-bm
⁸ https://thrift.apache.org/
⁹ http://mininet.org/

2.7 Apache Thrift

Apache Thrift¹⁰ is a library for building Remote Procedure Call (RPC) clients and servers in different programming languages. As the name suggests, RPC allows a client to remotely call procedures on a server, as if they were local. The goal behind the Thrift project is to provide reliable and efficient communication between two programs written in different languages. [36]

With Thrift, all data types and procedures are defined in a single file with the language neutral Thrift definition language. This file is then compiled by the Thrift compiler into generated RPC code in the desired programming language. [36]

Thrift is used on several occasions for communication in the implementation for the second research question, which is why it is briefly introduced here.

¹⁰ https://thrift.apache.org/


3 Related Work

This chapter presents previous work that is related to this thesis project and its research questions.

3.1 DC.p4: Programming the Forwarding Plane of a Data-Center Switch

Sivaraman et al. [45] present a P4 implementation of the data plane for a data center switch. The implementation is called DC.p4 and aims to compete in functionality with switches commonly found in data centers at the time of the study, which was in 2015. The goal of the study was both to produce a working P4 data plane for data centers and to evaluate and improve the P4 language.

Sivaraman et al. describe the functionality of the resulting implementation (DC.p4) in a table, and the main ones listed are the following: MAC learning, VLANs, packet validation, STP port states, equal-cost multi-path routing (ECMP), IP forwarding, link aggregation, access control list (ACL) for MAC and IP, and tunneling with technologies such as NVGRE, VXLAN and ERSPAN.

The authors also mention how improvements to the P4 language were made as part of the study, in order to achieve all desired functionality. Many of the features could already be modeled with the existing P4 language definition. These included IP forwarding, STP port states and packet validation. However, to achieve the remaining ones, changes were made to the language.

In order to add support for VLAN tags, Sivaraman et al. had to add header stacks, which are variable length headers, to the language. This was because there can be a variable number of VLAN tags in a single packet. They also added some new actions, mainly: packet cloning, packet dropping and digest generation. The packet cloning and dropping were needed for ACL, while digest generation was used for MAC learning. These new language features are now part of the P4 language specification, as a result of the study.

Sivaraman et al. also explained another potential feature that is not currently modeled by the P4 language, but which has relevance for data center switches. This is control over how packets are scheduled, which would require viewing multiple packets at the same time. Adding this could improve performance in data centers, and the authors argue that an extension to the language should be considered.


The implementation by Sivaraman et al. relates to both the first and second research questions in this thesis. This is because it is open source¹ and is a data plane switch implementation in P4. However, no control plane for the switch is presented in the paper, which is a major part of the implementation performed in this thesis.

3.2 ARP-P4: deep analysis of a hybrid SDN ARP-Path/P4Runtime switch

Martinez-Yelmo et al. [38] analyse and describe their ARP-P4 switch implementation, which was presented in an earlier paper [37] by the same authors. They describe the ARP-P4 switch as a hybrid switch in the sense that it supports two ways of finding link layer paths through a switched network.

The first approach used is based on SDN and having a centralized controller through the use of P4Runtime, and the second uses the protocol ARP-Path which is a distributed protocol. The authors describe ARP-Path as an exploration protocol for finding the shortest paths within a network. The protocol is based on ARP requests, which are broadcasts, to control the source MAC learning of the switches. The switches are programmed to only learn the link where the first copy of the broadcast is received, and the authors explain that this is achieved by temporarily dropping any subsequent ones. The protocol also provides loop prevention and is an alternative to the spanning tree protocols.

Martinez-Yelmo et al. describe that the idea behind having both a centralized SDN controller and the distributed ARP-Path is to allow for configuration of, for example, access control lists centrally, while still allowing the forwarding to be autonomous. This is done by giving any rules installed by P4Runtime from the SDN controller priority over the ARP-Path behaviour.

When designing the ARP-P4 switch, specifically the table for storing learned MAC addresses, the authors explain that a P4 program cannot directly update its own tables. Instead it must rely on some kind of control plane to do this. However, they decided against this approach and instead defined a non-standard extern in P4, which they were able to do since the target switch was the behavioral model v2 (BMv2), which is a software target for P4. This allowed them to update the table without involving any control plane.

To evaluate their implementation, Martinez-Yelmo et al. used the Mininet platform and a legacy ARP-Path software switch to compare against. The result was that the performance difference was insignificant at lower loads of around 10%. However, when approaching higher loads of 40% the ARP-P4 switch performed significantly worse. One possible reason outlined by the authors is that the BMv2 target does not process incoming packets in order of arrival, and that this causes the ARP-Path exploration to not find the fastest paths. They also tested the behaviour of ARP-P4 for correctness, and this was done by writing tests using the Packet Test Framework (PTF) which interfaces with P4Runtime.

The ARP-P4 project shares similar goals to the implementation in the second research question of this thesis. This is because it is a switch implementation written in P4 that achieves loop prevention. However, some main differences are that the implementation in this thesis uses the Rapid Spanning Tree Protocol for loop prevention, and not ARP-Path. It also does not rely on any non-standard externs, and instead uses a local control plane for each switch.

¹ https://github.com/p4lang/papers/tree/master/sosr15/DC.p4

3.3 Hybrid SDN Architectures

This section presents studies covering architectures for hybrid SDN networks, which are of interest for the third research question. These papers mainly target switches controlled by the OpenFlow API, while the interest in this thesis is on P4-programmable switches. However, it is the ideas regarding how SDN switches can interact with legacy switches that are of interest, and these are not specific to OpenFlow.

3.3.1 Constructing an Optimal Spanning Tree Over a Hybrid Network with SDN and Legacy Switches

In their paper, Wang et al. [47] discuss a scheme for constructing optimal spanning trees in networks containing both legacy and SDN switches. The networks in question are hybrid networks which are divided into a number of SDN and legacy islands, where the switches in an SDN island are surrounded only by legacy switches and vice versa.

The problem at hand, which is packet broadcast storms, has solutions in pure legacy or SDN networks. In legacy networks this can be solved with spanning tree protocols, while in the SDN case, the controller can prevent loops with its global knowledge of the network. However, the authors describe that to solve this problem in a hybrid network, a new scheme is necessary.

The strategy presented by the authors constructs an optimal spanning tree in a hybrid network with regard to two performance metrics. The first is the all-pair average path latency, and the second is the all-pair average path throughput. These metrics can be weighted differently, depending on the user’s preference.

In order to construct the spanning tree, the authors argue that there are two problems that need to be solved. The first is the case where there are multiple links between two islands, and the second is when loops exist at the island level.

The first problem is detected and solved by listening to bridge protocol data unit (BPDU) packets in the SDN islands. These packets are generated by the legacy switches as part of the spanning tree protocol, and with this information the SDN controller can detect all links that connect legacy islands with SDN islands. With this knowledge, it just has to select one link that should be kept and disable all others between the islands.

To solve the second problem, each island is considered as a single node. This means that a spanning tree can be created on island level, thus removing any loops between islands. For such a spanning tree to be created, the island network topology must be known. Similar to the solution for the first problem, this is also achieved by having SDN switches listen to BPDU packets.

In order to achieve an optimal spanning tree, the authors perform an enumeration of all possible spanning trees on the island level. For each of these trees, grades are calculated for the two metrics. This is then used to determine the optimal spanning tree, based on the weightings of the metrics.
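The enumeration step can be illustrated with a small Python sketch. It is not the algorithm from Wang et al.: the grading formula (each metric normalized against the best tree and combined with the two weights), the function names and the example island graph are all assumptions made for illustration. Path throughput is taken to be the throughput of the slowest link on the path, and the island graph is assumed to be connected.

```python
from itertools import combinations


def is_spanning_tree(nodes, edges):
    """Return True if `edges` joins all `nodes` without closing a loop."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this link would create a loop
        parent[root_u] = root_v
    return len(edges) == len(nodes) - 1


def tree_metrics(nodes, edges, links):
    """All-pair average path latency and average path throughput of a tree."""
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    latency_sum = throughput_sum = 0.0
    pairs = 0
    for src in nodes:
        # Walk the tree from src; a tree has exactly one path to each node.
        stack = [(src, None, 0.0, float("inf"))]
        while stack:
            node, prev, latency, throughput = stack.pop()
            if node != src:
                latency_sum += latency
                throughput_sum += throughput
                pairs += 1
            for nxt in adj[node]:
                if nxt != prev:
                    link_latency, link_throughput = links[frozenset((node, nxt))]
                    stack.append((nxt, node, latency + link_latency,
                                  min(throughput, link_throughput)))
    return latency_sum / pairs, throughput_sum / pairs


def best_spanning_tree(nodes, links, w_latency=0.5, w_throughput=0.5):
    """Enumerate every spanning tree and return (edges, metrics) of the best."""
    all_edges = sorted(tuple(sorted(link)) for link in links)
    candidates = [
        (edges, tree_metrics(nodes, edges, links))
        for edges in combinations(all_edges, len(nodes) - 1)
        if is_spanning_tree(nodes, edges)
    ]
    best_latency = min(m[0] for _, m in candidates)
    best_throughput = max(m[1] for _, m in candidates)

    def grade(metrics):
        # Assumed grading: 1.0 means "as good as the best tree" per metric.
        return (w_latency * best_latency / metrics[0]
                + w_throughput * metrics[1] / best_throughput)

    return max(candidates, key=lambda c: grade(c[1]))


# Hypothetical island-level graph: link -> (latency, throughput).
NODES = ["A", "B", "C"]
LINKS = {
    frozenset(("A", "B")): (1.0, 10.0),
    frozenset(("B", "C")): (1.0, 10.0),
    frozenset(("A", "C")): (5.0, 1.0),
}
```

Calling `best_spanning_tree(NODES, LINKS)` selects the tree that avoids the slow A-C link. Exhaustive enumeration grows quickly with the number of links, which is tolerable here only because it is applied at the island level rather than to individual switches.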

3.3.2 Panopticon: Reaping the Benefits of Incremental SDN Deployment in Enterprise Networks

The paper by Levin et al. [35] presents Panopticon, a network architecture for incremental migration from legacy to SDN switches in enterprise networks. The main idea behind Panopticon is to make an abstraction of a hybrid network, consisting of legacy and SDN switches, into a “logical SDN”. This allows the SDN controller to view the network as if it only consists of SDN switches.

The authors describe that the main insight behind the architecture is that SDN capabilities can be achieved for any path that passes through at least one SDN switch. Ports in the network where this is desired are called SDN-controlled (SDNc) ports, and achieving this property is referred to as waypoint enforcement.

For the first step in solving waypoint enforcement, Levin et al. introduce a few concepts. The first is cell blocks, which are the islands of legacy switches obtained if all SDN switches in the network were to be removed. The second is the frontier of a cell block. This is the set of SDN switches adjacent to a certain cell block. Finally, these two are used to define the Solitary Confinement Tree (SCT) for an SDNc port. This consists of a spanning tree within the cell block for the SDNc port, together with links between the cell block and its frontier. In practice, SCTs are constructed with the use of VLANs. Each SDNc port has a unique VLAN ID within its cell block, and the spanning tree is handled by enabling per-VLAN spanning tree protocol in the legacy switches. The result of these SCTs is that traffic cannot go through any SDNc port without also passing an SDN switch in its frontier.
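The definitions of cell block and frontier map directly onto connected components in the topology graph. The following Python sketch shows one way to compute them; the function name, the edge-list representation and the example topology are assumptions for illustration and are not taken from the Panopticon paper. The VLAN-based construction of the SCTs themselves is not modeled.

```python
def cell_blocks(links, sdn_switches):
    """Return a list of (cell_block, frontier) pairs, each a set of switches.

    A cell block is a connected component of legacy switches once all SDN
    switches are removed; its frontier is the set of adjacent SDN switches.
    """
    adj = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    legacy = set(adj) - set(sdn_switches)
    seen, result = set(), []
    for start in sorted(legacy):
        if start in seen:
            continue
        block, frontier, stack = set(), set(), [start]
        while stack:
            node = stack.pop()
            if node in block:
                continue
            block.add(node)
            for nxt in adj[node]:
                if nxt in sdn_switches:
                    frontier.add(nxt)  # SDN neighbour borders this block
                elif nxt not in block:
                    stack.append(nxt)
        seen |= block
        result.append((block, frontier))
    return result


# Hypothetical topology: L* are legacy switches, S* are SDN switches.
LINKS = [("L1", "L2"), ("L2", "S1"), ("S1", "L3"), ("L3", "S2"), ("L1", "S2")]
BLOCKS = cell_blocks(LINKS, {"S1", "S2"})
```

In this example the two SDN switches split the legacy switches into the cell blocks {L1, L2} and {L3}, and both blocks have the frontier {S1, S2}.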

With SCTs introduced, the authors go on to explain the process of packet forwarding in Panopticon. For this there are two cases to consider. The first is when there is a shared SDN switch in the frontiers of the source and destination ports. In this case the SDN switch simply has to pass the packet into the destination SCT by rewriting the VLAN ID. The second case is when no SDN switch is shared. For this, Inter Switch Fabric (ISF) paths are used. These are paths between SDN switches, and can for example be realized using VLANs. Each path will then consume one VLAN ID in each cell block it crosses.

The result of Panopticon is that an abstraction can be created, where each SCT will represent wires between the SDNc port and the SDN switches in the SCT. As for the ISF paths, they will be represented by wires between two SDN switches in the abstracted view.

Levin et al. also cover supporting traffic between two legacy ports in the network. This can be allowed by a functionally equivalent implementation of STP on the SDN switches.

3.3.3 Telekinesis: Controlling Legacy Switch Routing with OpenFlow in Hybrid Networks

In a paper by Jin et al. [29] the network management framework Telekinesis is presented. Telekinesis aims to enable fine-grained control of a hybrid network consisting of both legacy and SDN switches. What distinguishes Telekinesis from other approaches is, according to the authors, that it allows for more control of the paths through the legacy parts of the hybrid network.

Jin et al. explain that Telekinesis is based on two assumptions about legacy switches. The first is that they use MAC learning, and the second is that they perform some kind of loop avoidance, such as STP. They also choose to refer to the tree of legacy switches resulting from loop prevention as the network underlay. With this in mind, the main idea behind Telekinesis is to use MAC learning in legacy switches in order to manipulate them. They provide the following example for how this manipulation is performed:

“For example, if we want to modify the action of a routing entry for MAC m from ‘send to port p1’ to ‘send to port p2’, we create a packet whose source address is m and make sure it arrives at the switch on port p2. The MAC learning algorithm sees the packet arriving on p2 and assumes its source address m is reachable on p2, therefore updating the corresponding forwarding entry.”
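The quoted manipulation can be reproduced with a minimal model of a MAC learning table. The Python sketch below is only an illustration of the principle: the class, its method names and the MAC address are hypothetical, and real switches additionally handle aging, VLANs and flooding.

```python
class LearningSwitch:
    """Minimal model of a legacy switch's MAC learning table."""

    def __init__(self):
        self.table = {}  # MAC address -> port where it was last seen

    def receive(self, src_mac, in_port):
        # Learning step: the source address is assumed reachable via in_port.
        self.table[src_mac] = in_port

    def out_port(self, dst_mac):
        # None stands for "unknown destination, flood the frame".
        return self.table.get(dst_mac)


switch = LearningSwitch()
m = "00:00:00:00:00:0a"

switch.receive(m, in_port=1)   # ordinary traffic from m arrives on p1
before = switch.out_port(m)    # entry now says "send to port p1"

switch.receive(m, in_port=2)   # crafted packet with source m arrives on p2
after = switch.out_port(m)     # entry has been rewritten to "send to port p2"
```

The switch cannot tell the crafted packet from genuine traffic, which is exactly what makes the manipulation possible without any configuration access to the legacy switch.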

The goal presented in the paper is to allow for changing the path between two end hosts with the use of this manipulation. The authors start by defining which switches need to be updated when performing such a path change. Firstly, this includes any switch which is part of the new path, but not the old. Secondly, any switch on which the new and old paths diverge or converge will also need to be updated.

The next step is to verify that it is possible to enforce the new path. For this Jin et al. start by defining the term update subpath, which is any sequence of adjacent switches that all need to be updated. They then present the two conditions that need to be satisfied for Telekinesis to be able to update the path. The first is that all legacy links on the new path need to be part of the network underlay, or have an SDN switch as neighbor. The second condition is that for each update subpath, at least one of the switches must be an SDN switch.
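As a rough illustration of these definitions, the Python sketch below derives the set of switches to update from an old and a new path, groups them into update subpaths and checks the second condition. It rests on several assumptions that are not stated in the summary above: paths are given as ordered lists of switch names, a divergence or convergence point is detected as a shared switch whose next or previous hop differs, and update subpaths are taken as maximal runs along the new path. The first condition, concerning the network underlay, is not checked.

```python
def switches_to_update(old_path, new_path):
    """Switches that are new on the path or where the paths split or merge."""
    old_next = dict(zip(old_path, old_path[1:]))
    old_prev = dict(zip(old_path[1:], old_path))
    new_next = dict(zip(new_path, new_path[1:]))
    new_prev = dict(zip(new_path[1:], new_path))

    update = set()
    for sw in new_path:
        if sw not in old_path:
            update.add(sw)  # part of the new path but not the old
        elif (old_next.get(sw) != new_next.get(sw)
              or old_prev.get(sw) != new_prev.get(sw)):
            update.add(sw)  # the paths diverge or converge here
    return update


def update_subpaths(new_path, update):
    """Maximal runs of adjacent switches on the new path that need updating."""
    runs, current = [], []
    for sw in new_path:
        if sw in update:
            current.append(sw)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def second_condition_holds(new_path, update, sdn_switches):
    """Every update subpath must contain at least one SDN switch."""
    return all(any(sw in sdn_switches for sw in run)
               for run in update_subpaths(new_path, update))


# Hypothetical example: the path A-B-C-D is moved to A-E-F-D.
OLD_PATH = ["A", "B", "C", "D"]
NEW_PATH = ["A", "E", "F", "D"]
UPDATE = switches_to_update(OLD_PATH, NEW_PATH)
```

Here all four switches on the new path need updating and form a single update subpath, so the change can only be enforced if at least one of A, E, F or D is an SDN switch.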

To perform the actual path change, Telekinesis lets the SDN switches create and send gratuitous ARP packets to manipulate the MAC learning tables of any legacy switch that needs updating. The authors note that they use ARP packets to limit any side effects, but that the only requirement is that the packet has the correct source MAC address. They also bring up an issue with this approach, which is that the whole path does not update atomically. This can result in what they call path flapping, where late packets between the end hosts are on the wire during the path change, and revert the updates. To mitigate this, Telekinesis will continue the manipulation until the path stabilizes.

3.3.4 HybridFlow: A Lightweight Control Plane for Hybrid SDN in Enterprise Networks

Huang et al. [27] present the architecture behind HybridFlow, a control plane for hybrid SDN networks. The main idea behind HybridFlow is to provide an abstraction of the switches in the network that hides all legacy switches and instead provides a view consisting of logical SDN switches. The goal with this is to allow SDN controllers to work with the abstracted view, without knowing about the existence of the legacy switches. HybridFlow thus acts as a layer between the SDN controller and the network.

The authors explain that the switches in the hybrid network are divided into clusters, each containing at least one SDN switch and any number of legacy switches. Each of these clusters and the ports of its switches are then abstracted into a logical SDN switch. This results in a view of the network with only these logical SDN switches. However, the authors do not discuss any details about what a good cluster division is or how it should be performed.

In order to provide the abstracted view of logical SDN switches, the authors describe that HybridFlow performs the following three functions: port allocation, rule generation and message transformation. Port allocation is the act of mapping between the logical ports in the abstraction and the physical ports in the cluster. Rule generation involves making sure that any traffic that enters a cluster is passed through the SDN switch of that cluster. The authors describe that this can be achieved using access control list (ACL) functionality of legacy switches. Finally, message transformation refers to the need for HybridFlow to transform OpenFlow messages between the controller and SDN switches. This is since the controller only knows of the logical SDN switches, which do not reflect the physical ones.

The paper by Huang et al. does not go into detail about the implementation of the architecture, but gives an overview of the design and ideas behind it.

3.4 Previous Studies of Hybrid SDN

There have been previous studies exploring existing techniques and challenges for hybrid SDN networks. Amin et al. [1] present a survey of the hybrid SDN research field. The survey covers existing research of hybrid SDN in the following five categories: deployment strategies, controllers, network management, traffic engineering, along with testing/verification and security. In a different survey by Sinha et al. [44], models for hybrid SDN networks are researched and compared based on properties like investment, traffic management, automation and scalability. Additionally, specific approaches for each model are presented. Huang et al. [28] present a survey focusing on deployment and optimization strategies of hybrid SDN. In the study, different models are analyzed and solutions for the data and control planes are presented.

These previous surveys are related to the third research question of this thesis. The main difference with the study performed in this thesis is that it focuses on migration to SDN with P4-programmable switches. This allows for discussion of the approaches based on P4 data planes, and control planes compatible with these. Meanwhile, the previous studies are broader, and mainly assume the usage of the OpenFlow API.

4 Method

This chapter describes the methodology that was used for answering the three research questions of the project.

4.1 Literature Review

A literature review was performed at the start of the thesis project. The goal of this review was to gain a better understanding of the project area and its current state, mainly the P4 language and hybrid SDN. Later during the project, this study was revisited to answer the third research question.

The report Guidelines for performing Systematic Literature Reviews in Software Engineering by Kitchenham et al. [31] was used as a guide in conducting the study.

Finding the literature was done using the academic database Google Scholar. This was initially done through searches including keywords like “P4”, “switch”, “SDN”, “migration” and “hybrid”. After a relevant paper was found and read, its references to other papers were used to continue the search. Additionally, the function that allows finding papers with citations to the current one was also used.

Determining what literature was of relevance was based on the three research questions. This meant that the aim was to find literature concerning the following: switch implementations using P4, approaches to hybrid SDN networks, as well as other general studies concerning hybrid SDN.

Summaries of the most relevant papers, together with some notes on their relation to this thesis, were written down and can be found in Chapter 3.

4.2 Setup of Development Environment

To be able to conduct the research, a development environment for P4 had to be set up. The goal with this was to be able to develop and build all found P4 projects, as well as being able to simulate a network of P4 switches in software.

The exact development environment and the procedure used for setting it up were documented and can be found in Appendix A.

4.3 Studying Open Source P4 Solutions

In order to be able to answer the first research question, three steps were taken. First the link layer functionality of interest in an Ethernet switch was defined. Then, a search was performed to collect a list of existing open source projects. Finally, these projects were evaluated based on the defined functionality to provide a result to the research question.

4.3.1 Defining Functionality

As described in Delimitations (Section 1.4), this thesis focuses on link layer functionality. Since the goal of the first question is to replicate functionality that comes out-of-the-box with legacy switches, the first step was to research what functions a legacy switch is expected to provide.

This information was collected through a few different sources. The first was through reading definitions of Ethernet switches in books covering computer networks. In addition to the books, a search was made for standards covering MAC and Ethernet switches, and the sections describing requirements for conformance to the standard were read. Lastly, details about legacy switches were also found in related work read during the literature review, specifically in those about hybrid SDN approaches. With this done, a list of link layer functionality that can be expected from an Ethernet switch was defined.

4.3.2 Finding Open Source Projects

The second step in answering the first research question was to search for and list any existing open source project that could assist in achieving the desired functionality on a P4-programmable switch. Three different online services were used to aid finding such projects.

Findings from Literature Review

The literature review performed (see Section 4.1) involved searching for and reading through papers related to P4 implementations and the P4Runtime API. This was done using the service Google Scholar and resulted in gaining knowledge of some projects which were relevant to this search. These papers and projects were revisited during this phase of the project.

GitHub’s Advanced Search

The next strategy used for finding more projects was the advanced search function on the service GitHub¹, which is a popular site for open source projects. This function allows searching based on a range of different parameters, such as keywords, programming languages and file extensions.

Searches were made with a mix of keywords such as “P4”, “P4Runtime”, “switch” and “SDN”. They were also done both with and without specifying P4 as language and file extension. Since the number of search results was of a reasonable size, each project could be visited and looked at manually to determine its relevance.

Web Searches

This search method was mainly targeted at finding SDN controllers with P4 support as well as potential P4 projects outside of GitHub. This was done using the search engine Google.

First, a search for SDN controllers was made, with the goal of finding existing open source ones. Then, additional searches were made including the name of each found controller together with the keywords “P4” and “P4Runtime”. This allowed finding controllers with support for P4-programmable switches.

¹ https://github.com/search/advanced

Finally, a similar search was done as the one performed on GitHub, described in the previous section, but only including the keywords.

4.3.3 Evaluating Projects

The final step of answering the first question was to evaluate each of the found projects. This evaluation was primarily based on the defined functionality as well as some general qualities of each project. The evaluation aimed to, for every project, draw the following conclusions for all the defined functionality:

• Is the functionality implemented for the data plane or control plane, or both?

• To what extent is it implemented? For example: is it already a complete solution, or just a partial example?

In addition to this, general conclusions were also noted about the projects, including the following information:

• What P4 language version is used, P4_14 or P4_16?

• Is the project active, and how well is its usage and implementation documented?

• For data-plane only solutions, what control plane support exists?

• For control plane only solutions, what support exists for P4 specific functionality?

To draw these conclusions, three steps were taken for each project. The first was to search for and read any available documentation of the project. The second step was to download and build the project, and to run any test cases or examples that were available. The final step was taken if necessary due to a lack of documentation. This involved reading through and experimenting with the source code of the project in order to draw conclusions.

4.4 Implementing a P4 Switch With RSTP

To answer the second research question, both the data and control planes for a P4 switch had to be built, and the Rapid Spanning Tree Protocol was to be implemented according to IEEE 802.1D-2004.

This question had a dependency on the first research question, for two reasons. The first was to verify that such a P4 switch was not already available as open source. If that was the case, the second research question would have been altered. The second reason was to avoid having to perform the implementation from scratch. This was to save time by building on existing solutions for the data or control plane.

4.4.1 Building on Open Source Solutions

After investigation of the first research question, it was realized that there currently did not exist any open source solutions that could provide working RSTP. As such, the second research question was still relevant and the next step was to decide if any of the projects could be used as a base to build upon.

Data Plane

For the data plane, there were two main candidates to be used as a base. These were the Switch.p4 and Sai_bridge.p4 projects (see 6.2.2 and 6.2.3). This resulted in a decision between three options, with the third being to write the data plane from scratch.

One disadvantage shared by both these projects was that they are written using the older P4_14 version of the language. However, there was not much else advocating for the “from scratch” option, since it was estimated that building on either project would save significant implementation time. It was therefore dismissed as an option.

In the remaining choice, Switch.p4 was chosen based on the results of the first research question. Both projects had similar data plane functionality, including support for setting RSTP states on switch ports. Both also support a higher level API, such as the Switch Abstraction Interface² (SAI), for a control plane to use. The main reason for choosing Switch.p4 was that it already had fully working MAC learning, as well as a bigger set of test cases that demonstrated its use. Switch.p4 also included another API for the control plane called SwitchAPI (see 6.2.2).

Control Plane

After Switch.p4 had been selected to be used as data plane, three possible options were identified for implementing the control plane. These were the following:

1. Using SwitchAPI with a custom control plane.

2. Using the SAI API with a custom control plane.

3. Using SAI with the SONiC controller (see 6.2.4), and by extending SONiC with RSTP.

There was little motivation behind building upon SONiC for the second research question. The reason for this was that SONiC, at the time of this thesis project, had no support for any spanning tree protocol, and that MAC learning was already achievable with Switch.p4 alone. There were potential benefits with SONiC, mainly that it provides other functionality which could be desired in the future by Saab. However, due to the increased development time estimated for understanding and extending the big project, it was dismissed as an option.

Therefore it was decided to implement a custom control plane, and SwitchAPI was used for this. The main reason for using SwitchAPI was its lower complexity, while still exposing all necessary data plane control that would be needed by RSTP. Since the SAI implementation in Switch.p4 builds upon SwitchAPI, it would have required more effort to investigate and fix problems encountered in the project if SAI was used.

4.4.2 Programming Language for the Control Plane

The final major decision that was taken before starting on the implementation was which programming language to use.

Since SwitchAPI is a C library, one natural option would be to implement the custom controller in C or C++. However, SwitchAPI also includes a Thrift RPC server (see 2.7). This server allows calling of API functions from other languages using a Thrift client, and the use of this was demonstrated through test cases of Switch.p4 which were written in Python.

Before making this decision, a simple control plane that only initiated SwitchAPI was set up in both C++ and Python. From this it was concluded that either language would work. Therefore the choice was mainly a matter of preference, and Python was chosen due to being preferred by the supervisor at Saab.

Since SwitchAPI generates Python 2 code for use with Thrift, the control plane was also required to use Python 2 to be able to import this. However, an attempt was made to keep the control plane code compatible with Python 3 where possible, to aid a future transition.

² https://github.com/opencomputeproject/SAI

4.4.3 Implementation Process

There were three main steps in implementing the complete P4 switch. First, a controller with the original Spanning Tree Protocol (STP) was implemented to control the Switch.p4 data plane, which ran on the BMv2 software switch. During this process, patches with bug fixes and improvements to Switch.p4 were also made. The next step was to replace the implementation of STP with one of the Rapid Spanning Tree Protocol (RSTP). Finally, the control plane was ported to work with another data plane which runs on a real hardware switch.

Patching Switch.p4 and SwitchAPI

While the Switch.p4 project contained tests which exercise the STP APIs of SwitchAPI, these were fairly limited compared to how a full protocol implementation interacts with the API. Therefore, a number of bugs were found in the SwitchAPI C code, as well as the P4 code.

These issues were fixed as properly as possible, but in some cases with no documentation of original intended behavior available, the issues were fixed in a way suited for this thesis’ use case.

Starting With STP

While the goal of the second research question was to implement RSTP, the development started with the original STP algorithm as defined in IEEE 802.1D-1998. There were a few reasons behind this choice, and they were the following:

1. RSTP builds upon STP and is backwards compatible with it. As such, understanding STP helps with understanding RSTP as well.

2. STP is of a lower complexity than RSTP.

3. The STP algorithm is defined through a concrete C reference implementation, while RSTP is defined through a set of state machines and natural text.

Because of these reasons, STP was deemed to be significantly easier to implement. This allowed using the STP implementation to verify that the data plane was working as expected, and that all interactions between the control and data planes worked. This was desirable before starting work on the more complex RSTP standard.

While a large part of this implementation consisted of porting the reference C code to Python, it also involved general work unrelated to STP. These were fixes to bugs in the data plane, and a framework for the controller to interact with the switch.

Continuing With RSTP

Once the STP implementation was in a state where it showcased that the protocol was functioning, the development moved on to implementing RSTP.

Some additions to the framework had to be made, such as adding the possibility of sending RST BPDUs. However, the main chunk of the work consisted of following the definitions of RSTP in IEEE 802.1D-2004, which mainly uses a set of state machines to describe the behavior of the protocol. This could eventually replace the previous STP logic.
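To give an idea of what sending an RST BPDU involves, the sketch below packs the 36-byte RST BPDU payload with Python’s `struct` module. It is written in Python 3 for illustration and is not the code from the thesis, which targeted Python 2; the function names and default timer values are assumptions. The field order and the 1/256-second timer encoding follow the BPDU format in IEEE 802.1D-2004. The Ethernet and LLC headers that carry the BPDU are not included.

```python
import struct


def bridge_id(priority, mac):
    """8-byte bridge identifier: 2-byte priority followed by the 6-byte MAC."""
    return struct.pack("!H", priority) + bytes.fromhex(mac.replace(":", ""))


def encode_time(value_seconds):
    """BPDU timer fields are encoded in units of 1/256 second."""
    return int(value_seconds * 256)


def build_rst_bpdu(root_id, root_path_cost, sender_id, port_id, flags,
                   message_age=0, max_age=20, hello_time=2, forward_delay=15):
    """Pack an RST BPDU payload (without Ethernet and LLC headers)."""
    return struct.pack(
        "!HBBB8sI8sHHHHHB",
        0x0000,                      # protocol identifier
        2,                           # protocol version: RSTP
        0x02,                        # BPDU type: RST BPDU
        flags,                       # role, state and handshake bits
        root_id,                     # root bridge identifier
        root_path_cost,
        sender_id,                   # transmitting bridge identifier
        port_id,
        encode_time(message_age),
        encode_time(max_age),
        encode_time(hello_time),
        encode_time(forward_delay),
        0,                           # version 1 length, always 0
    )


# Hypothetical bridge announcing itself as root on a designated,
# learning and forwarding port (flags 0x3C).
ROOT = bridge_id(0x8000, "00:11:22:33:44:55")
BPDU = build_rst_bpdu(ROOT, 0, ROOT, 0x8001, flags=0x3C)
```

The resulting bytes correspond to the fields that a packet analyzer decodes when inspecting a captured RST BPDU, which makes such a capture a convenient way to check an encoder like this one.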

During this process, it was realized that IEEE 802.1D-2004 contains errors and ambiguities. Examples of such include logical expressions without explicit precedence, infinite loops, reading of undefined variables and calling functions that take parameters without specifying what is passed. In many of these cases, it was fairly trivial to still realize the intentions of the standard. However, in some cases help was taken from the more modern standard, IEEE 802.1Q-2018, which defines the Multiple Spanning Tree Protocol. This protocol is compatible with RSTP and shares many similarities to it. It could therefore be used for hints.

Porting to a Hardware Switch

At this point the implementation had only been run on the BMv2 software switch, which is intended for testing and debugging purposes. Since the goal was to run on real hardware, Saab provided a Stordis BF2556X-1T [46] switch with a P4-programmable Tofino³ chip. For this switch there was a preference by Saab to use another proprietary project instead of the open source Switch.p4 project.

This resulted in having to port the Python controller to use a different API than SwitchAPI. However, everything else about the control plane could be kept intact, and it was only the framework for interacting with the data plane that had to be updated. One example is the code for setting the RSTP port states of the switch. As such, the porting work required a relatively small amount of time.

Creating a Command Line Interface

IEEE 802.1D-2004 defines a set of management operations for RSTP that give an administrator control over the protocol. This allows both reading and setting parameters, such as the switch and port priorities.

To implement this part of the standard, a small Command Line Interface (CLI) was made in Python which connects to the controller of a switch to read and set its configuration. In order to allow for this, a general configuration server was first developed and included in the Python controller.
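A CLI of this kind could, for example, validate priorities before forwarding them to the controller. The sketch below uses Python 3’s `argparse` and is purely hypothetical: the program name, the command names and the structure are assumptions and do not describe the CLI built in the thesis, and the connection to the configuration server is left out. The permitted value ranges, bridge priority 0 to 61440 in steps of 4096 and port priority 0 to 240 in steps of 16, are those given by IEEE 802.1D-2004.

```python
import argparse


def bridge_priority(text):
    """Accept only bridge priorities allowed by IEEE 802.1D-2004."""
    value = int(text)
    if not 0 <= value <= 61440 or value % 4096:
        raise argparse.ArgumentTypeError(
            "bridge priority must be 0-61440 in steps of 4096")
    return value


def port_priority(text):
    """Accept only port priorities allowed by IEEE 802.1D-2004."""
    value = int(text)
    if not 0 <= value <= 240 or value % 16:
        raise argparse.ArgumentTypeError(
            "port priority must be 0-240 in steps of 16")
    return value


def build_parser():
    # All command names below are hypothetical.
    parser = argparse.ArgumentParser(prog="rstp-cli")
    commands = parser.add_subparsers(dest="command", required=True)

    commands.add_parser("show", help="read the current configuration")

    set_bridge = commands.add_parser("set-bridge-priority")
    set_bridge.add_argument("priority", type=bridge_priority)

    set_port = commands.add_parser("set-port-priority")
    set_port.add_argument("port", type=int)
    set_port.add_argument("priority", type=port_priority)
    return parser
```

With this parser, `build_parser().parse_args(["set-port-priority", "3", "128"])` yields a namespace with `port=3` and `priority=128`, while a value such as 1000 for the bridge priority is rejected before any request reaches the switch.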

4.4.4 Building a Test Environment

To verify that the complete implementation worked as expected, a virtual test environment was set up that could run on any computer. The goals with the setup were the following:

• Being able to simulate a network consisting of multiple switches.

• Having the possibility to simulate links and switches going down and coming back up.

• Being able to include other implementations of STP and RSTP in the simulations.

The tool that was chosen for this was Mininet (see Section 2.6), which is a popular platform among researchers for emulating networks on a single computer [11]. Mininet was also used by Martinez-Yelmo et al. in testing their ARP-P4 implementation (see Section 3.2). Mininet allows emulating a network consisting of multiple hosts and switches, and also allows disabling and enabling switches and links during testing through its CLI.

To be able to run multiple instances of the Switch.p4 data plane on a single computer, some additions had to be made to it. As an example, this included making the ports it uses configurable.

The Ubuntu package bridge-utils was used for testing the RSTP implementation against another implementation. Bridge-utils allows setting up switches in Linux that include an STP implementation, and these can also be used inside of Mininet. However, they do not have RSTP support and therefore another project called Mstpd⁴ was also used. Mstpd implements MSTP on top of bridge-utils, but can also be put in RSTP mode.

Mininet Visualization

Because of the difficulties in verifying correct protocol behavior through the reading of log files from multiple switches, a graphical tool was developed.

³ https://www.barefootnetworks.com/products/brief-tofino/
⁴ https://github.com/mstpd/mstpd

This tool visualizes the whole Mininet network and allows seeing which switches are running and which links are currently up. It also shows the current spanning tree based on the port states of the switches.

4.5 Testing the P4 Switch Implementation

This section describes how the P4 switch was tested to verify correct behavior of its forwarding, MAC learning and RSTP implementation. Some testing was performed continuously as part of development, while a more extensive test was performed at the end. Most effort was spent on testing the RSTP implementation, since it is the most complex part of the switch.

Both the BMv2 and Stordis versions of the project were tested. This was done using Mininet for the BMv2 version and a physical network setup for the Stordis version.

4.5.1 Inspecting Traffic With Wireshark

A general tool which was used in all the testing is the packet analyzer Wireshark⁵. Wireshark performs live capturing and decoding of packets received or sent on an interface. During testing this was used to verify that the correct packets were being sent, and that their contents matched expectations. Wireshark presents the contents of packets in a human-readable format, and an example of inspecting a sent RST BPDU is shown in Figure 4.1.

Figure 4.1: Inspection of BPDU contents using Wireshark.

4.5.2 Testing Forwarding and MAC Learning

Testing the forwarding and MAC learning functionality of the implementation was done through pings between two Linux hosts in the LANs. Pinging results in the sending of packets using the ARP and ICMP protocols, and these were inspected in Wireshark to verify correct forwarding behavior of the switches.

Since ARP requests are sent to the MAC broadcast address, these were used to test the broadcasting functionality. The ICMP requests and replies use unicast MAC addresses, which made it possible to verify that flooding, learning and aging worked as expected. This was done by checking that the initial ping traffic was flooded across the network, but that subsequent packets were only sent out on the single learned port. Then the ping was stopped to wait out MAC aging, after which the test was repeated to verify that all previously learned addresses had expired.
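The flood, learn and age behavior checked in this test can be illustrated with a minimal Python model of a learning switch. This is a sketch for illustration only and not the code of the project: the class name LearningSwitch, the port numbers, the shortened MAC strings and the 300 second aging time are all assumptions.

```python
class LearningSwitch:
    """Toy model of MAC learning, flooding and aging (illustrative only)."""

    def __init__(self, ports, aging_time=300.0):
        self.ports = set(ports)
        self.aging_time = aging_time   # assumed value, in seconds
        self.table = {}                # MAC -> (port, time last seen)

    def receive(self, src, dst, in_port, now):
        """Return the set of ports the frame is sent out on."""
        # Learning is based on the source address only.
        self.table[src] = (in_port, now)
        # Drop entries that have not been refreshed within the aging time.
        self.table = {mac: (port, seen)
                      for mac, (port, seen) in self.table.items()
                      if now - seen < self.aging_time}
        entry = self.table.get(dst)
        if entry is None:
            return self.ports - {in_port}   # unknown destination: flood
        return {entry[0]} - {in_port}       # known destination: single port


sw = LearningSwitch(ports=[1, 2, 3])
first = sw.receive("aa", "bb", in_port=1, now=0.0)      # bb unknown: flooded
reply = sw.receive("bb", "aa", in_port=2, now=1.0)      # aa was learned on port 1
later = sw.receive("aa", "bb", in_port=1, now=2.0)      # bb was learned on port 2
expired = sw.receive("aa", "bb", in_port=1, now=400.0)  # bb has aged out
```

The four calls mirror the manual test: the first packet is flooded, later packets follow the learned ports, and after the aging time has passed the traffic is flooded again.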

⁵ https://www.wireshark.org/


4.5.3 Testing RSTP in Mininet

The test environment that had been developed was used as the main method for testing the RSTP implementation. This was due to its higher flexibility and convenience compared to the physical network of Stordis switches. However, note that both the BMv2 and Stordis versions share the same Python controller, and consequently also the same RSTP implementation, with only minor adjustments.

Mininet Topologies

The network topologies used in the Mininet tests consisted of 6 switches, with links between them to form a complete graph. Three of the switches also had a host connected. See Figure 5.4 in the Implementation chapter for a visualization of this topology.

The idea behind using a complete graph was that any unwanted links could be brought down during the test using the Mininet CLI. Similarly, if testing with fewer than 6 switches was desired, any unwanted switches could also be stopped using the CLI.
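As a hedged illustration of why a complete graph is convenient, the sketch below enumerates its links and shows that smaller topologies are obtained purely by removing links or switches. The switch names s1 to s6 and the helper names are assumptions and are not taken from the project.

```python
from itertools import combinations

switches = [f"s{i}" for i in range(1, 7)]   # six switches; names are assumed
links = list(combinations(switches, 2))     # one link per unordered pair


def without(links, removed):
    """Links that remain after the given switches are stopped."""
    return [(a, b) for a, b in links if a not in removed and b not in removed]


# Bringing down all links except neighbors (and s1-s6) leaves a six-node ring.
ring = [(a, b) for a, b in links
        if (int(b[1:]) - int(a[1:])) in (1, 5)]
```

A complete graph on 6 switches has 15 links, so any simple topology on up to 6 switches is a subset of it.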

Two switch types, P4 and Linux (see Section 5.7.1), were used and each can be put in RSTP or STP mode. This resulted in the topologies being divided into three categories: only P4 switches, only Linux switches, and a mix of P4 and Linux switches. Each category in turn had three topologies: RSTP only, STP only, and a mix of RSTP and STP. As such, 9 topologies were created for testing, but note that the topologies with only Linux switches were only used as references to compare behavior.
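The count of topologies follows from combining the three switch categories with the three protocol modes. The short sketch below makes this explicit; the string labels are assumptions chosen for illustration.

```python
from itertools import product

switch_mixes = ["p4", "linux", "p4+linux"]
protocol_mixes = ["rstp", "stp", "rstp+stp"]

topologies = list(product(switch_mixes, protocol_mixes))

# Linux-only topologies serve as references for expected behavior.
under_test = [t for t in topologies if t[0] != "linux"]
```

This gives 9 topologies in total, of which 6 contain at least one P4 switch, which presumably accounts for the 6 topologies mentioned in the test procedure.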

Test Procedure

The actual testing was performed by starting the Mininet emulator with one of the 6 topologies, and then performing the tests manually. The Mininet CLI was used to simulate switches and links going up or down, and RSTP configuration was changed during run time using the management CLI.

To verify correct operation, interactions were performed using the CLIs, and information from four sources was used to ensure that the switches were behaving as expected. The expected behavior was determined either from the standard and the knowledge gained from implementing the protocol, or by using a topology with only Linux switches for reference. The following sources were used to verify correct behavior:

• Wireshark to verify BPDUs being sent at the correct rate, and with the expected information.

• The management CLI to check the current per-switch and per-port RSTP variables in the Python controllers.

• The Mininet visualization (see Section 5.7.4) to easily inspect the current spanning tree, and detect any anomalies.

• If necessary, the log files of the controllers, drivers and BMv2 instances.

The network interactions performed during testing included bringing switches and links up or down, while running a ping between the hosts in the network. These arbitrary interactions were combined with a list of specific scenarios to include in the test, which were the following:

• Cause election of a new root switch, through either:

– The root switch being shut off.

– Links to the root switch going down.

– Change of switch priorities through the management CLI.


• Change the spanning tree to any desired layout by modifying switch priorities and port path costs through the CLI.

• Change the STP compatibility mode at run time using the CLI.

• Change timer intervals from the CLI, both on the root switch, which should make other switches inherit them, and on a non-root switch.

• Have a shared media link, with 3 ports connected, to test the backup port role and to verify that port priorities work as expected. Since Mininet does not support adding such a link, it was manually configured as a workaround when this was to be tested.

4.5.4 Testing With BMv2 as a Software Switch

To verify that more complex traffic than pings was forwarded correctly by the switch, testing was performed with real Internet traffic.

This was done using a PC with two Ethernet interfaces, and by configuring BMv2 to use these real interfaces rather than virtual ones. The drivers and Python controller were then run, resulting in the PC working as a P4 switch with two ports.

One port was connected to a laptop, and the other to a LAN with access to the Internet. It was then verified that the laptop had Internet access and was able to browse web pages without issues.

4.5.5 Testing on Stordis Switches in a Physical Network

Testing was also performed on real hardware in a physical network. The network setup used consisted of four Stordis BF2556X-1T [46] switches, one Netgear GS110TP [40] switch, and two host PCs.

The P4 switch implementation ran on the Stordis switches, which include a Tofino⁶ chip for running P4 programs. The Netgear switch, which has support for RSTP, was included to test interoperability with a traditional switch. Finally, the hosts were used to send pings for testing. The setup is visualized in Figure 4.2.

⁶ https://barefootnetworks.com/products/brief-tofino/


[Figure: four Stordis switches (Stordis 1 to Stordis 4), one Netgear GS110TP switch, and two hosts (PC 1 and PC 2).]

Figure 4.2: The physical network setup used for testing.

Note that the setup also included an out-of-band management network, not shown in the figure. This network was connected to the management ports of the Stordis switches, which are accessible from the CPU of each switch, where the controller is running. Using this, logs from the Python controller could be accessed, and the management CLI could be used. However, as a result of running on real hardware, no data plane log files were available, and the visualization program was not used. For the Netgear switch, its management website was used to read and set RSTP configuration.

The testing performed included verifying that forwarding and MAC learning worked, as described earlier, and that the switches were participating in RSTP properly. Since the RSTP logic had already been tested in Mininet, the focus was on testing the details that differed between the Stordis and BMv2 versions, or were unique to running on real hardware. As a result, the testing performed was not as thorough as the Mininet tests, and only included the following:

• Pinging between the hosts while plugging arbitrary cables in and out between the switches. This was to verify that the interactions between the RSTP logic and the data plane worked, and that a proper spanning tree was upheld in cooperation with the Netgear switch.

• Changing RSTP configuration at run time through the management CLI and the Netgear management website.

• Verifying with Wireshark that the BPDUs being sent on the wires were valid packets, and that they were accepted by the Netgear switch.

4.6 Studying Approaches for Partial Migration

To answer the third research question, the results from the literature review, as well as experience from the P4 switch implementation, were used.

In the literature review, articles related to this research question were searched for. This included approaches for performing partial migration and architectures for how a hybrid network can be controlled by a centralized controller. In addition to this, studies of the state of the hybrid SDN research field were read to find higher level approaches, and to find other articles.

One of the more straightforward approaches, which is to have the P4 switch mimic the behavior of legacy switches, was tested through the P4 switch implementation as part of the second research question. The knowledge gained from this also helped to ensure that the other approaches found were relevant and possible to implement with P4 switches.


5 P4 Switch Implementation

This chapter describes the resulting implementation of the second research question, which is a P4 switch that achieves Ethernet forwarding, MAC learning, and the Rapid Spanning Tree Protocol (RSTP).

The implementation has been run both on the BMv2 software switch and on Stordis BF2556X-1T [46] hardware switches. The Stordis switches include a general purpose CPU and a Barefoot Tofino chip¹ which runs the actual P4 program. However, this chapter will mostly focus on the architecture when run on the open source BMv2 switches, since some details of the hardware switches are proprietary. The main difference when running on Stordis switches is that a different driver and P4 program are run, resulting in a different API between the Python controller and the drivers. Refer to Figure 5.2 in the architecture overview.

5.1 Features

The switch implementation is based on the IEEE standard for MAC bridges, 802.1D-2004 [17]. It does not implement the complete standard, and this was never the goal of the project. The main features of the switch are presented in the following list, and these have been implemented based on the standard.

• Ethernet forwarding

– Broadcasting and Multicasting

• MAC learning

– Flooding

– Aging

• Rapid Spanning Tree Protocol

• Management CLI for RSTP configuration

¹ https://barefootnetworks.com/products/brief-tofino/


5.2 Source Code

The BMv2 version of the project has been published on GitHub².

5.3 Architecture Overview

The architecture of the switch is divided into two main parts, the control plane and the data plane.

Since the Rapid Spanning Tree Protocol (RSTP) is a distributed algorithm, each switch needs a local control plane running it. This control plane runs on a general purpose CPU inside each switch, and Figure 5.1 demonstrates the setup in a network of three switches. Note that the management CLI is connected to the control planes of the switches through an out-of-band management network.

[Figure: three switches (Switch 1 to Switch 3), each consisting of a CPU running the local control plane and a BMv2 / Tofino data plane, together with the management CLI.]

Figure 5.1: Overview of a network of P4 switches.

The control plane is implemented mainly through two programs. The first is a Python controller which configures the switch and implements the RSTP logic. The second is a small C++ driver program, which exposes SwitchAPI to the controller and handles the communication with the data plane.

The data plane consists of the actual switch chip and the P4 code running on it. This is either the BMv2 software switch, or the Tofino chip of the Stordis switch. An overview showing the different components and how they interact in the case of the BMv2 switch is shown in Figure 5.2. Note that since BMv2 is a software switch, it runs on the same CPU as the control plane, and the CPU port is just a virtual Ethernet pair.

² https://github.com/henli070/p4-rstp-switch


[Figure labels: local control plane, data plane, management CLI (TCP), Python controller running RSTP, SwitchAPI calls, Thrift server, driver program, SwitchAPI, auto-generated pdfixed API, P4 table operations, Thrift server, BMv2 running switch.p4, CPU port (STP BPDUs in/out), switch ports.]

Figure 5.2: Overview of a single P4 switch with BMv2.

The following sections will cover the different components, starting from the bottom with the Switch.p4 data plane and drivers, then the Python controller and finally the management CLI.

5.4 Switch.p4 Project

Note that this section only applies when running on the BMv2 switch, since another, proprietary P4 project was run on the Stordis switches.

The Switch.p4 project includes both the P4 code for the data plane and the SwitchAPI library which is used by the drivers running in the control plane. In the following sections, only the parts of the Switch.p4 project that are of relevance for the switch implementation are described. Note that Switch.p4 also includes a large amount of other link layer functionality (see Section 6.2.2), as well as network layer functionality, which was not needed for this implementation but could be of interest for future extension.

5.4.1 The P4 Data Plane

The P4 code in the project is written in P4_14, and is divided into 28 different P4 source files. The majority of the code is limited to standard P4_14, with an exception being the parts related to packet replication, which are needed for multicasting and broadcasting. These parts rely on non-standard intrinsic metadata³ defined by the simple_switch target in BMv2. The P4_14 specification allows such custom definitions, but notes that programs using them are not fully portable across architectures [10].

5.4.2 The Driver Program and SwitchAPI

The driver program itself is a very small C++ program running in the control plane which is responsible for the following:

• Initializing the pdfixed library.

• Initializing the SwitchAPI library.

• Starting the SwitchAPI Thrift RPC server for use by the Python controller.

Pdfixed is a C library that is generated from the P4 source code with the help of the p4c-bm compiler. This library provides a low level API for interacting with the data plane, mainly by exposing functions for manipulating the table entries of the P4 program. For example, the function p4_pd_dc_dmac_table_add_with_dmac_hit(...) is generated to allow adding an entry with the dmac_hit action to the dmac table.

Since the pdfixed library is a very low level library, the Switch.p4 project also includes another C library called SwitchAPI. SwitchAPI builds on top of pdfixed and provides a higher level API for manipulating the data plane. For example, the call switch_api_stp_port_state_set(...) is used to set the STP state on a port. In addition to this, SwitchAPI handles both MAC learning and aging by default.

A Thrift server is also included along with the SwitchAPI library and allows it to be called from languages other than C or C++ (see Section 2.7). The Python controller in the control plane connects to this server.

5.4.3 Implementation Details

This section describes the main functionality provided by Switch.p4, including details of how it is implemented. Note that all P4 tables in the explanations are empty by default, and it is the responsibility of the driver program and SwitchAPI to populate the tables with entries. See Section 2.4.4 for background on P4 tables and actions.

Basic Ethernet Forwarding

Forwarding is mainly implemented through a P4 table called dmac, short for destination MAC. This is the table that is populated by the MAC learning process, but it can also contain static unicast and multicast entries. The table performs an exact match against the destination MAC of the packet, as well as its bridge domain, which is used to implement VLANs.

For the purposes of this implementation, there are three actions for this table that are of interest:

• dmac_hit(ifindex) is used for unicast entries, and sets the outgoing port for the packet.

• dmac_multicast_hit(mc_index) is used for multicast entries. It sets the multicast group for the packet, causing it to be sent out on the ports in that group.

• dmac_miss() is the default action, and causes the packet to be broadcast.
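The lookup performed by the dmac table can be sketched in Python as a dictionary keyed on the bridge domain and destination MAC. This is only a model of the match and action selection described above, not the P4 code itself; the table contents and the tuple representation of actions are assumptions.

```python
def dmac_lookup(dmac_table, bridge_domain, dst_mac):
    """Exact match on (bridge domain, destination MAC); a miss is the default."""
    return dmac_table.get((bridge_domain, dst_mac), ("dmac_miss",))


# Example entries (assumed values): one unicast and one multicast entry.
dmac_table = {
    (1, "00:00:00:00:00:02"): ("dmac_hit", 2),            # ifindex 2
    (1, "01:00:5e:00:00:01"): ("dmac_multicast_hit", 7),  # mc_index 7
}

unicast = dmac_lookup(dmac_table, 1, "00:00:00:00:00:02")
unknown = dmac_lookup(dmac_table, 1, "00:00:00:00:00:99")
other_domain = dmac_lookup(dmac_table, 2, "00:00:00:00:00:02")
```

Note that the same MAC address in a different bridge domain is a miss, which is what keeps VLANs separated.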

³ https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md


MAC Learning

To support MAC learning, another table called smac, short for source MAC, is used. This table is meant to contain all currently learned MAC addresses, and it performs an exact match against the bridge domain and the source MAC of a packet. It has two possible actions:

• smac_hit(ifindex) compares the incoming port of the packet with the previously learned port, which is stored in the ifindex parameter, and sets metadata indicating that the address has moved if they differ.

• smac_miss() sets metadata to signal that the table was missed.

To perform the actual learning, a second table called learn_notify is used, which matches against the metadata set by the smac table, as well as the RSTP state of the incoming port. If the source MAC was a miss or had moved, and the RSTP state of the port is one with learning enabled, then a notification is sent to the control plane using the generate_digest primitive in P4_14 [10]. This allows the driver program to update the dmac table with the learned address.

Aging of learned addresses is achieved by setting a timeout on the smac table. When an entry times out, a notification is sent to the driver program, which removes the entry from both the smac and dmac tables.
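The decision of when to notify the control plane can be summarized in a few lines of Python. This is a sketch of the logic described above under stated assumptions: the function names, the string names of the port states and the representation of metadata as a pair of booleans are hypothetical and are not taken from Switch.p4.

```python
# RSTP port states in which learning is enabled.
LEARNING_STATES = {"learning", "forwarding"}


def smac_lookup(smac_table, bridge_domain, src_mac, in_port):
    """Return (miss, moved) metadata, mirroring smac_miss and smac_hit."""
    learned_port = smac_table.get((bridge_domain, src_mac))
    if learned_port is None:
        return True, False                  # smac_miss
    return False, learned_port != in_port   # smac_hit(ifindex)


def should_notify(miss, moved, port_state):
    """learn_notify: signal the control plane only if learning is enabled."""
    return (miss or moved) and port_state in LEARNING_STATES


smac_table = {(1, "00:00:00:00:00:01"): 3}   # address learned on port 3
new_host = smac_lookup(smac_table, 1, "00:00:00:00:00:02", in_port=4)
moved_host = smac_lookup(smac_table, 1, "00:00:00:00:00:01", in_port=5)
known_host = smac_lookup(smac_table, 1, "00:00:00:00:00:01", in_port=3)
```

Only new or moved addresses on ports with learning enabled produce a notification, so steady-state traffic generates no control plane load.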

RSTP Port States

The RSTP state for each port is stored in a table called spanning_tree. The purpose of this table is to set metadata that causes packets incoming on a discarding port to be dropped. The metadata is also used by the smac table to tell if learning is enabled on the port, as mentioned previously.

Avoiding the forwarding of packets onto discarding ports is mainly handled outside of the P4 code. This is achieved by limiting the groups of ports on which packets are multicast and broadcast so that they do not include any discarding ports. The process of updating these groups is out of scope of the P4_14 standard and is therefore done by the drivers using BMv2-specific means.
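A minimal sketch of how such a port group could be derived from the RSTP port states is shown below. It treats only forwarding ports as members, which is an assumption consistent with RSTP, where ports in the learning state do not yet forward traffic. The function name and state strings are hypothetical.

```python
def flood_group(port_states):
    """Ports that may carry flooded, multicast or broadcast traffic."""
    return sorted(port for port, state in port_states.items()
                  if state == "forwarding")


states = {1: "forwarding", 2: "discarding", 3: "learning", 4: "forwarding"}
before = flood_group(states)

states[2] = "forwarding"   # port 2 transitions to forwarding
after = flood_group(states)
```

Whenever the RSTP logic changes a port state, the group has to be recomputed and synced to the data plane, which is the step that patch 2 in Section 5.4.4 concerns.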

The CPU Port and BPDUs In/Out

One switch port of the data plane is given the special role of being the CPU port (see Figure 5.2). This port is connected to the control plane and it is mainly intended for two purposes. The first is to allow the switch to forward a packet it receives onto the CPU port, allowing the control plane to read it. Which packets are forwarded is decided by the control plane through table entries. The second purpose is to let the control plane cause packets to be sent out of the switch on one of the normal ports.

Both of these are achieved by defining a custom protocol on top of Ethernet. The header for this protocol includes metadata, such as on which port a packet was received by the switch, or on which port the control plane wants to send a packet. The payload of this custom protocol carries the packet in question.

The CPU port plays an important role in the RSTP implementation, since the control plane needs to both send and receive BPDUs in order to communicate with neighboring switches. To make sure that received BPDUs are forwarded through the CPU port, a table entry is added that matches the destination MAC 01:80:C2:00:00:00, which is used by the spanning tree protocols.
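The encapsulation idea can be sketched with the struct module from the Python standard library. The header layout used here, a single 16-bit port number placed directly in front of the original frame, is a deliberately simplified assumption. The real header used by Switch.p4 sits on top of an outer Ethernet header and carries more fields, and none of the names below are taken from the project.

```python
import struct

BPDU_DMAC = bytes.fromhex("0180c2000000")   # destination MAC from the text
CPU_HEADER = struct.Struct("!H")            # hypothetical: 16-bit port number


def to_cpu(in_port, frame):
    """Data plane to control plane: prepend the ingress port."""
    return CPU_HEADER.pack(in_port) + frame


def from_cpu(packet):
    """Control plane side: recover the port and the original frame."""
    (port,) = CPU_HEADER.unpack_from(packet)
    return port, packet[CPU_HEADER.size:]


def is_bpdu(frame):
    """Match on the destination MAC, the first six bytes of the frame."""
    return frame[:6] == BPDU_DMAC


# A dummy frame: destination MAC, source MAC, length field, payload.
frame = BPDU_DMAC + bytes(6) + b"\x00\x27" + b"payload"
port, inner = from_cpu(to_cpu(3, frame))
```

The same header can be used in the opposite direction, where the port field instead tells the data plane on which port to send the enclosed packet.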

5.4.4 Patches to the Project

A few patches were made to the Switch.p4 project, fixing issues and adding some functionality. The following list describes the issues that were found:


1. The aging timeout was implemented on the dmac table instead of the smac table. This resulted in addresses never aging out as long as packets were still being sent to them. MAC learning should only be based on packets received from a host, and not on packets destined to the host. Therefore, this timeout was moved to the smac table.

2. Updates to the multicast groups were not properly synced with BMv2 after changing the RSTP state for a port. This caused the switch to broadcast packets onto discarding ports, leading to broadcast storms.

3. When changing the state of a port to discarding, it was assumed to have been part of a multicast group. This was true if the previous state was Forwarding, but not if it was Listening, which led to errors.

4. Updating the aging interval only affected newly learned entries, and not already existing ones. This issue was not patched, and a workaround was added to the Python controller instead. The reason for this was that the issue had minimal impact on the operation of RSTP, together with time constraints.

5.5 Python Controller With RSTP

The Python controller runs in the local control plane and has the task of initializing the data plane to achieve Ethernet forwarding, and of continuously running the RSTP algorithm to prevent loops in the network.

5.5.1 Basic Initialization

If the BMv2 switch is run with just the drivers alone, it will not forward any packets. This is because the P4 tables need to be populated to achieve any kind of functionality. When using SwitchAPI, the Python controller does this in the following steps:

1. Initialize SwitchAPI.

2. Create a default VLAN.

3. Initialize the used ports, and add them to the VLAN.

4. Set up the switch to forward BPDUs onto the CPU port, and start listening for these.

5. Set the aging time for MAC learning.

The result of this initialization is a switch that performs Ethernet forwarding and MAC learning, and provides the necessary infrastructure for the RSTP algorithm to run.

5.5.2 RSTP Implementation

The RSTP module of the Python controller is implemented following clause 17 of IEEE 802.1D-2004 [17]. The standard uses natural language to describe the variables, timers and procedures used, while the actual behavior of the protocol is defined by state machine diagrams. The diagrams are programming language neutral, and describe operations performed when entering a state, as well as conditions for changing state. The full protocol definition includes 8 timers, 7 per-switch variables and 45 per-port variables, which are used by a set of 10 different state machines.

All the procedures, variables and the operations for each state are implemented to exactly replicate the descriptions in the standard. They are also named to match the standard, and documented with references to the sections they are described in. Some details of the protocol are out of the scope of the standard, and how they are achieved is up to each implementation. These will be described in the following sections.


Updating the State Machines

The state machines are expected to update as soon as one of their state change conditions evaluates to true. These conditions depend on variables which might be modified by external means, other than the state machine itself. To detect such changes immediately, advantage is taken of the fact that only a limited number of events can initiate any kind of change in the protocol. These are the following:

• Initialization of the protocol.

• One second of time passing. This is due to every timer of the protocol being an integral number of seconds.

• A BPDU being received through the CPU interface.

• Configuration changes by management.

Upon any of these events, a lock is acquired, and the state machines are repeatedly updated until they reach a stable state, that is, until one iteration passes without any state changes. No processing or busy waiting of any kind is done between these events.
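The update strategy can be sketched as follows. The Countdown class is a hypothetical stand-in for a real RSTP state machine, and the function and variable names are assumptions; only the pattern of acquiring a lock and iterating until no machine changes state reflects the description above.

```python
import threading


class Countdown:
    """Stand-in state machine: moves one step per update until done."""

    def __init__(self, steps):
        self.steps = steps

    def update(self):
        """Return True if the machine changed state."""
        if self.steps == 0:
            return False
        self.steps -= 1
        return True


lock = threading.Lock()


def on_event(machines):
    """Run all machines until one full iteration makes no state change."""
    iterations = 0
    with lock:
        changed = True
        while changed:
            # A list forces every machine to be updated in each iteration.
            changed = any([m.update() for m in machines])
            iterations += 1
    return iterations


machines = [Countdown(2), Countdown(3)]
first_run = on_event(machines)    # e.g. triggered by a received BPDU
second_run = on_event(machines)   # already stable: a single iteration
```

Because the lock serializes the event handlers, a timer tick, a received BPDU and a management change can never update the state machines concurrently.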

Interactions With the Drivers

The RSTP algorithm requires control of a few different aspects of the data plane. This is achieved through communication with the driver program, which in turn manipulates the data plane and P4 tables. The following control is needed:

• Enabling and disabling forwarding on a port.

• Enabling and disabling learning on a port.

• Flushing all learned MAC addresses on a port. However, the standard also allows flushing more entries than necessary, even the whole learning table.

• Changing the aging time for MAC learning.

When running the Python controller with the BMv2 data plane, these are done using calls to SwitchAPI through its Thrift RPC server.

Sending and Receiving BPDUs

One interaction with the data plane which bypasses the drivers is the reading and sending of RSTP BPDUs on the ports of the switch. As previously mentioned, the data plane is set up to forward any BPDUs onto its CPU port, which is connected to an interface of the CPU where the controller runs. Metadata about which port of the switch the packet was received on is included in a custom Ethernet header which encapsulates the BPDU. For sending BPDUs, the controller constructs the BPDU packet and encapsulates it in this custom header, which allows it to inform the P4 program on which port to send out the BPDU.

To achieve this, the Scapy library⁴ is used, which allows sniffing and sending raw packets on an interface. Through Scapy, the formats of all packet headers, including the custom one and BPDUs, are defined and linked together. This causes Scapy to automatically parse incoming raw packets into a higher level representation as a Python object, and also simplifies construction and sending of such packets.
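To show what such a packet definition has to capture, the sketch below packs the 36-byte body of an RST BPDU using the struct module from the standard library instead of Scapy. It is an illustration and not the controller code: the helper names and example values are assumptions, and the Ethernet and LLC headers that precede the BPDU on the wire are omitted.

```python
import struct

# Field layout of an RST BPDU, in network byte order (36 bytes in total).
RST_BPDU = struct.Struct("!HBBB8sI8sHHHHHB")


def bridge_id(priority, mac):
    """8-byte bridge identifier: 2-byte priority followed by the MAC address."""
    return struct.pack("!H", priority) + bytes.fromhex(mac.replace(":", ""))


def build_rst_bpdu(root, cost, bridge, port_id, flags,
                   message_age=0, max_age=20, hello_time=2, forward_delay=15):
    """Pack an RST BPDU; timers are given in seconds, encoded in 1/256 s."""
    return RST_BPDU.pack(
        0x0000,            # protocol identifier
        2,                 # protocol version: RSTP
        0x02,              # BPDU type: RST BPDU
        flags, root, cost, bridge, port_id,
        message_age * 256, max_age * 256, hello_time * 256, forward_delay * 256,
        0,                 # version 1 length
    )


me = bridge_id(0x8000, "00:00:00:00:00:01")   # example values, assumed
# Flags 0x3C: designated port role, learning and forwarding.
bpdu = build_rst_bpdu(root=me, cost=0, bridge=me, port_id=0x8001, flags=0x3C)
fields = RST_BPDU.unpack(bpdu)
```

A Scapy packet class expresses the same layout declaratively as a list of fields, which is what allows it to both parse and build BPDUs from one definition.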

⁴ https://scapy.net/


5.6 Management CLI

The management CLI is a small Python program that connects to the controller and allows reading and setting configuration related to RSTP, based on clause 14 of 802.1D-2004.

The CLI connects to a TCP server that runs in the Python controller, and a simple custom binary protocol is used for sending and responding to requests. Various information about the current RSTP state concerning the whole switch, or specific to a certain port, can be requested. An example of this is shown in Figure 5.3, where the information about port 1 on a switch is dumped through the CLI.

Figure 5.3: Dumping information about port 1 through the CLI.

The CLI also allows setting RSTP configuration such as priorities for switch and port identifiers, STP compatibility mode, and the intervals for each timer of the protocol.

To implement the CLI, the Cmd framework⁵ is used. This framework is part of the Python standard library and allows simple definition of commands, including features such as help text and auto completion for them.
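A minimal sketch of a shell built on cmd.Cmd is shown below. The command names, the prompt and the setting name bridge_priority are hypothetical, and a plain dictionary stands in for the TCP connection to the controller. Only the use of cmd.Cmd itself reflects the text above.

```python
import cmd
import io


class RstpShell(cmd.Cmd):
    """Hypothetical command set; the real CLI talks to the controller over TCP."""

    prompt = "rstp> "

    def __init__(self, config, stdout=None):
        super().__init__(stdout=stdout)
        self.config = config    # stands in for the controller connection

    def do_show(self, arg):
        """show: print the current configuration."""
        for key, value in sorted(self.config.items()):
            self.stdout.write(f"{key} = {value}\n")

    def do_set(self, arg):
        """set <name> <value>: change an integer setting."""
        name, value = arg.split()
        self.config[name] = int(value)

    def do_quit(self, arg):
        """quit: leave the shell."""
        return True


# Drive the shell programmatically instead of calling cmdloop().
out = io.StringIO()
shell = RstpShell({"bridge_priority": 32768}, stdout=out)
shell.onecmd("set bridge_priority 4096")
shell.onecmd("show")
```

Each do_ method becomes a command, and its docstring is what the built-in help command prints, which is how the framework provides help text with little extra code.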

5.7 Test Environment

The test environment allows testing of the implementation by emulating a network of switches and hosts on a single computer. This environment is only used for the BMv2 version of the project, since the Stordis port was tested using a physical network setup. To run the emulation, the platform Mininet (see Section 2.6) and its Python API are used.

5.7.1 Defining Custom Switch Types for Mininet

To run Switch.p4 in the emulation, and to have another RSTP implementation to test against, two switch types are defined for Mininet. This is done by extending the Switch class of Mininet and overriding its start and stop functions. Defining a new switch type allows it to be added to the network topology, and to be run in the emulation.

⁵ https://docs.python.org/3/library/cmd.html


The Switch.p4 Switch Type

This switch type enables Mininet to start and stop BMv2 running the Switch.p4 project and the Python controller. Its start function creates the virtual Ethernet pair for the CPU port, then runs BMv2, the drivers and the Python controller with appropriate arguments.

The Linux Switch Type

Another switch type is defined to allow emulation together with another implementation of RSTP. This is achieved by using the bridge-utils Ubuntu package, together with the Mstpd⁶ project.

Bridge-utils allows creating a Linux software switch, which is capable of running STP. Mstpd provides an MSTP implementation for bridge-utils, which can be forced into RSTP mode.

5.7.2 Defining Network Topologies

With the two switch types defined, network topologies including them can be created through the Mininet API. This is a simple process which is done through calls to the three functions addHost(), addSwitch() and addLink().

The addSwitch() function lets the caller specify the type of switch to add, and arguments to pass to it. Both the Switch.p4 and Linux types accept an argument specifying whether they should be started in STP or RSTP mode, which is useful for testing STP compatibility of the RSTP implementation.

5.7.3 Running the Emulation

To run the Mininet emulation, a test program in Python is used. This program takes an argument of which topology to use and initializes Mininet with it. It then starts the Mininet emulation along with the visualization. When started, Mininet creates all necessary virtual Ethernet pairs based on the links in the topology, and runs the start method on each switch.

Finally, the Python program opens the Mininet CLI for the test user to interact with the emulation. This allows the user to start or stop switches, and to bring links up or down.

5.7.4 Mininet Visualization

When the test program starts the Mininet emulation, it also opens a window showing a view of the network that updates in real time. This view shows a graph of hosts, switches and links based on the chosen Mininet topology, and visually provides the following information to the test user:

• The type and status of nodes. Hosts are shown as blue, while switches are green when running or gray if stopped.

• Status of the links between any two nodes. This is based on whether the link is down or not, as well as the RSTP port states of its endpoints. A link can be either down, forwarding, blocked or pending.

– Downed links are shown with a dotted line. The real life equivalent could be a broken or unplugged cable.

– Forwarding links are black, and these are links where neither endpoint is blocking the traffic.

⁶ https://github.com/mstpd/mstpd


– Blocked links are shown as red. This means that one of the switches is discarding traffic on its port to this link.

– Pending links are orange, and this is used in all other cases. For example, this could be a link where one switch is in the learning state, meaning it will soon become forwarding.
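The classification in the list above can be sketched as a small function. This is a minimal sketch and not the thesis code. It assumes that port states are given as the strings "discarding", "learning" and "forwarding", that a host endpoint is passed as "forwarding", and that a link only counts as forwarding when both endpoints forward. The names link_status and LINK_COLORS are hypothetical.

```python
def link_status(link_up, state_a, state_b):
    """Classify a link from its up/down flag and the port state at each end."""
    if not link_up:
        return "down"
    states = (state_a, state_b)
    if all(state == "forwarding" for state in states):
        return "forwarding"
    if any(state == "discarding" for state in states):
        return "blocked"
    # Everything else, for example one endpoint still in the learning state.
    return "pending"


# Colors as described in the list; downed links are drawn dotted instead.
LINK_COLORS = {"forwarding": "black", "blocked": "red", "pending": "orange"}

for args in [(False, "forwarding", "forwarding"),
             (True, "forwarding", "forwarding"),
             (True, "discarding", "forwarding"),
             (True, "learning", "forwarding")]:
    print(args, "->", link_status(*args))
```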

The visualization lets the test user easily see the current spanning tree in the network, without having to read through log files from each Python controller. An example is shown in Figure 5.4, where the Mininet emulation is run with 3 hosts and 6 switches. Notice how the RSTP algorithm running on each switch has created a spanning tree throughout the network.

Figure 5.4: Visualization of the nodes and links in the Mininet network.

Reading Port States

To show the status of each link, the visualization needs access to the RSTP states of each switch’s ports in the network.

For switches running Switch.p4 and the Python controller, this is achieved by connecting to the configuration server of each controller. This is the same server that the management CLI uses, and it allows requesting the state of a certain port over TCP.

To do this for the Linux switches, shell commands are used instead. Bridge-utils provides the brctl showstp command, which is used to dump STP information about a certain Linux switch. The output from this command is then parsed to determine the state of each port.
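The parsing step could look roughly like the sketch below. It is not the thesis code. The SAMPLE text is an abbreviated, assumed rendering of what brctl showstp prints (a port header such as "eth0 (1)" followed by a line containing "state"), and the real output may differ between versions. The names parse_port_states, PORT_RE and STATE_RE are hypothetical.

```python
import re

# Abbreviated, assumed example of `brctl showstp br0` output.
SAMPLE = """\
br0
 bridge id\t\t8000.0242ac110002
 designated root\t8000.0242ac110002

eth0 (1)
 port id\t\t8001\t\t\tstate\t\t     forwarding
 designated root\t8000.0242ac110002\tpath cost\t\t 100

eth1 (2)
 port id\t\t8002\t\t\tstate\t\t       blocking
"""

PORT_RE = re.compile(r"^(\S+) \((\d+)\)\s*$")   # e.g. "eth0 (1)"
STATE_RE = re.compile(r"\bstate\s+(\S+)")       # e.g. "state   forwarding"


def parse_port_states(text):
    """Return {port name: state} from showstp-style text."""
    states = {}
    current = None
    for line in text.splitlines():
        header = PORT_RE.match(line)
        if header:
            current = header.group(1)
            continue
        found = STATE_RE.search(line)
        if current and found:
            # Keep the first state seen for each port section.
            states.setdefault(current, found.group(1))
    return states


print(parse_port_states(SAMPLE))
```

On a machine with bridge-utils installed, the text could be obtained with subprocess.run(["brctl", "showstp", bridge_name], capture_output=True, text=True).stdout, where bridge_name is whatever name the Linux switch was created with.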


Drawing the Graph

Drawing the graph is a simple process with the help of two Python libraries, NetworkX⁷ and Matplotlib⁸. The network topology is first described as a graph to NetworkX, which includes algorithms for creating layouts for graph drawing. This means that the position of each node does not have to be specified, but is calculated automatically.

Matplotlib is then used for creating the plot window and to set up continuous callbacks for the animation of it. On every callback the elements of the graph are drawn using NetworkX, which also uses Matplotlib internally.

⁷ https://networkx.github.io/
⁸ https://matplotlib.org/


6 Results

This chapter presents the results that were obtained for each of the three research questions. It does not directly answer the questions, but provides the basis needed for this. See Section 8.1 in the Conclusion chapter for the answers.

6.1 Link Layer Functionality

This section presents the results from studying link layer functionality, including specific protocols for achieving it. The following list shows general functionality that was identified to be of interest for an Ethernet switch. Refer to Section 2.2 in the Background chapter for more details on the definitions.

• Ethernet forwarding

– Learning

– Flooding

– Aging

• Broadcasting

• Multicasting

– IGMP snooping

• Redundancy and loop prevention

– Spanning Tree Protocol (STP)

– Rapid Spanning Tree Protocol (RSTP)

– Multiple Spanning Tree Protocol (MSTP)

– Shortest Path Bridging (SPB)

• Virtual LANs and trunking

• Link aggregation


– Link Aggregation Control Protocol (LACP)

• Discovery

– Link Layer Discovery Protocol (LLDP)

In the case of a pure SDN network where interoperability with legacy switches is not needed, there is also no need for many of the protocols in the above list. The main functionalities presented are still of interest, but the SDN controller has more options for achieving them due to its central knowledge and ability to configure the switches.

6.2 Open Source P4 Solutions

This section presents open source projects which are either full solutions, or building blocks that can be used for P4-programmable switches.

Each section includes a table with the three columns: functionality, implementation and notes. The functionality column contains the set of relevant features defined in Section 6.1. The implementation column specifies if the project supports the feature in the data plane, control plane or in both. Finally, the notes column gives additional information about the implementation, such as protocols supported, or missing behaviour.

6.2.1 Tutorials and Examples

The projects presented in this section are simple examples or tutorial exercises. This means that they do not contain any complete solutions, but can be used as building blocks in one.

Tutorials by the P4 Language Consortium

The official tutorials for the P4 language¹ contain a set of exercises as well as solutions for them. Table 6.1 shows what link layer functionality was found in the exercises and their solutions. Both forwarding and multicasting were found in the multicast exercise.

Table 6.1: Link layer functionality in the P4 tutorials.

Functionality         Implementation   Notes
Ethernet forwarding   Both planes      Has flooding, but no MAC learning
Broadcast/Multicast   Data plane       -
Loop prevention       None             -
Virtual LANs          None             -
Link aggregation      None             -
Discovery             None             -

The data plane parts of the tutorials are written in P4₁₆, targeting the V1Model architecture. These are run on BMv2 in a network emulated by Mininet. As for the control plane, it is written in Python and uses P4Runtime to program the switches.

The repository is active at the time of writing this report, and is well documented.

The p4-learning Repository

This repository² includes many exercises and examples. Out of these, the two examples l2_learning and multicast, as well as the three exercises 03-L2_Basic_forwarding, 03-L2_Flooding and 04-L2_Learning, all implement relevant link layer functionality. Table 6.2 shows what functionality is covered.

¹ https://github.com/p4lang/tutorials
² https://github.com/nsg-ethz/p4-learning


Table 6.2: Link layer functionality in the p4-learning repository.

Functionality         Implementation   Notes
Ethernet forwarding   Both planes      Flooding and MAC learning, only missing aging
Broadcast/Multicast   Both planes      -
Loop prevention       None             -
Virtual LANs          None             -
Link aggregation      None             -
Discovery             None             -

The examples are written in P4₁₆ and target the V1Model architecture. The control planes are written in Python, but do not use P4Runtime. Instead, an API from the p4-utils repository³ is used, which works directly on top of BMv2.

The MAC learning example comes in two different implementations with different methods for communicating with the controller. One uses the digest extern to send small messages, and the other clones the incoming packet to the controller using the clone3 extern.
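Whichever mechanism delivers the (source MAC, ingress port) pairs, the bookkeeping on the controller side is similar. The sketch below is a minimal illustration of that bookkeeping and is not code from the p4-learning repository. It also includes aging, which the examples leave out. The class name MacTable, its methods and the 300 second default are assumptions made for illustration.

```python
class MacTable:
    """Maps a MAC address to (port, last_seen) with learning and aging."""

    def __init__(self, max_age=300.0):
        self.max_age = max_age  # seconds; the default is an assumed value
        self.entries = {}

    def learn(self, mac, port, now):
        """Record or refresh the port on which a source MAC was seen."""
        self.entries[mac] = (port, now)

    def lookup(self, mac, now):
        """Return the egress port, or None if the caller should flood."""
        entry = self.entries.get(mac)
        if entry is None or now - entry[1] > self.max_age:
            return None
        return entry[0]

    def age_out(self, now):
        """Remove stale entries and return the removed MAC addresses."""
        stale = [mac for mac, (_, seen) in self.entries.items()
                 if now - seen > self.max_age]
        for mac in stale:
            del self.entries[mac]
        return stale


table = MacTable(max_age=10.0)
table.learn("00:00:00:00:00:01", port=1, now=0.0)
print(table.lookup("00:00:00:00:00:01", now=5.0))
print(table.lookup("00:00:00:00:00:02", now=5.0))
```

In a real controller, each learn and age_out call would additionally add or remove the matching entry in the forwarding table of the data plane.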

The project is well documented and has received activity as recently as during this thesis project.

The p4-researching Repository

This repository⁴ contains a variety of example programs. Table 6.3 shows the available link layer functionality found. These are from two example programs, digest and learning-switch, in the repository.

Table 6.3: Link layer functionality in the p4-researching repository.

Functionality         Implementation   Notes
Ethernet forwarding   Both planes      Has MAC learning, but only for ARP packets, and missing aging
Broadcast/Multicast   Both planes      Partial, enough to support ARP
Loop prevention       None             -
Virtual LANs          None             -
Link aggregation      None             -
Discovery             None             -

Two methods for achieving MAC learning are demonstrated here as well, although only for Address Resolution Protocol (ARP) packets. In the learning-switch example, packets with unknown destination MAC addresses are sent to the control plane by forwarding them out on the CPU port. In the digest example, the digest extern is used instead.

All examples in the repository are written in P4₁₆ for V1Model and use a Python P4Runtime controller. The examples are complete in that both the data and control planes can be run and tested within Mininet.

At the time of this thesis project, the last update for the repository was in December 2018.Its usage and implementations are well documented, although some parts are in Chinese.

6.2.2 The Switch.p4 Project

The Switch.p4 project⁵ aims to implement the data plane of a data center switch, and is an updated version of DC.p4⁶ (see Section 3.1).

³ https://github.com/nsg-ethz/p4-utils
⁴ https://github.com/kevinbird61/p4-researching
⁵ https://github.com/p4lang/switch
⁶ https://github.com/p4lang/papers/tree/master/sosr15


Table 6.4 shows the relevant functionality provided by Switch.p4, which is written in the older P4₁₄ language version.

Table 6.4: Link layer functionality provided by the Switch.p4 project.

Functionality         Implementation   Notes
Ethernet forwarding   Both planes      Fully featured MAC learning
Broadcast/Multicast   Both planes*     Including IGMP snooping
Loop prevention       Data plane       Support for any spanning tree protocol
Virtual LANs          Both planes*     Supports trunking
Link aggregation      Both planes*     Data plane supports LACP
Discovery             Data plane       Support for LLDP

* Control plane in the form of a test case.

Control Plane Support

While the project mainly focuses on a data plane implementation, it does provide higher level APIs for a control plane to use. These are called SwitchAPI and SwitchSAI.

SwitchAPI is an API that was specifically designed for Switch.p4 and simply provides a higher level of abstraction to avoid the control plane having to manage all tables in the data plane manually. SwitchSAI builds on top of SwitchAPI and implements the Switch Abstraction Interface (SAI)⁷ standard. As a result, the switch can be controlled by any control plane that has SAI support, such as SONiC (see Section 6.2.4). The project also contains tests which demonstrate the use of the APIs.

Activity and Documentation

The open source version of Switch.p4 has not seen activity on GitHub since January 2018, and is still written for the older P4₁₄ version of the language. The documentation of usage and implementation was found to be lacking, and the P4 parts of it had no documentation⁸. However, it was sufficient for building the project and running the tests.

6.2.3 The Sai_bridge.p4 Project

The SAI-P4-BM repository⁹ contains the Sai_bridge.p4 data plane, as well as an implementation of the SAI standard on top of it. Table 6.5 shows what link layer features were identified to be implemented in this project.

Table 6.5: Link layer functionality provided by the Sai_bridge.p4 project.

Functionality         Implementation   Notes
Ethernet forwarding   Data plane       Has MAC learning
Broadcast/Multicast   Data plane       Dependent on control plane populating tables
Loop prevention       Data plane       Support for any spanning tree protocol
Virtual LANs          Data plane       Supports trunking
Link aggregation      Both planes      Example with LACP on top of SAI
Discovery             Data plane       Support for LLDP

⁷ https://github.com/opencomputeproject/SAI
⁸ https://github.com/p4lang/switch/issues/95
⁹ https://github.com/Mellanox/SAI-P4-BM


Control Plane Support

The only higher level API on top of this data plane is the Switch Abstraction Interface (SAI), which can be used to control the switch. This means that it can be used with, for example, the SONiC controller (see Section 6.2.4).

Activity and Documentation

At the time of this report, the last significant activity in the project was in December 2017. The data plane parts are written in the P4₁₄ language version, but there exists a git branch where it has been ported to P4₁₆. However, the documentation has not been updated on this branch, and attempts to build the P4₁₆ version of the project were unsuccessful. Building and running tests for the P4₁₄ version was successful, though.

There also exists some documentation¹⁰ for using the project together with SONiC, which was tested successfully.

6.2.4 The SONiC Network Operating System

SONiC¹¹ is a Linux-based network operating system that can run as a control plane on switches from multiple different vendors. The main requirement is that the switch supports the SAI standard. Since P4 is more general than SAI, P4 data planes can implement the standard, which is done by both Switch.p4 and Sai_bridge.p4. As such, they can be controlled by SONiC, which supports the functionality shown in Table 6.6. Although SONiC is open source¹², it is a relatively large project and therefore its documentation had to be relied on in determining its supported functionality.

Table 6.6: Link layer functionality of SONiC.

Functionality         Implementation   Notes
Ethernet forwarding   Control plane    Has MAC learning
Broadcast/Multicast   Control plane    -
Loop prevention       None             STP support planned
Virtual LANs          Control plane    With trunking
Link aggregation      Control plane    Supports LACP
Discovery             Control plane    Supports LLDP

It should be noted that while SONiC is referred to as a network operating system, it is not a centralized SDN controller. Instead, SONiC runs on and controls each of the switch devices in a distributed approach.

Support for P4 Specific Functionality

The SAI API provides a general abstraction of a switch and is intended to work with many different devices. While it is very general, it also includes more specific functionality which is optional for a switch to implement. In order to control any custom P4 specifics with SONiC and SAI they have to be extended, and this can be achieved through SAI extensions [30]. There also exists a backend for the P4 compiler called saiflex, which allows generating SAI headers based on the input P4 program¹³.

¹⁰ https://github.com/Azure/SONiC/wiki/SONiC-P4-Software-Switch
¹¹ https://azure.github.io/SONiC/
¹² https://github.com/Azure/sonic-buildimage
¹³ https://github.com/opencomputeproject/SAI/tree/master/flexsai/p4


Activity and Documentation

The project is actively developed at the time of this thesis project and the future plans for the project are documented in a road map¹⁴ on their GitHub wiki. The same wiki also documents the usage and development of the project.

6.2.5 The ONOS Controller and Fabric.p4

The Open Network Operating System (ONOS)¹⁵ is a general SDN controller that includes support for P4 and P4Runtime. The project also includes a data plane implementation called Fabric.p4¹⁶, written in P4₁₆, which is designed to work with Trellis [6]. Trellis is a leaf-spine switching fabric, which includes a set of ONOS applications suitable for data centers [22].

The link layer features provided are presented in Table 6.7, but note that the Trellis applications are intended for leaf-spine networks and also include many network layer features which are out of scope for this thesis.

Table 6.7: Link layer functionality of ONOS with fabric.p4.

Functionality         Implementation   Notes
Ethernet forwarding   Both planes      Centralized control
Broadcast/Multicast   Both planes      -
Loop prevention       Both planes      Centralized control, no STP
Virtual LANs          Both planes      With trunking
Link aggregation      None             -
Discovery             Both planes      Supports LLDP

Support for P4 Specific Functionality

In addition to target-independent applications such as Trellis, ONOS also allows creating applications with direct control of the P4 data plane. This is achieved by what ONOS refers to as a pipeconf. A pipeconf contains everything necessary to describe the data plane and let it be controlled by ONOS applications. Defining pipeconfs is necessary both for the target-independent applications and for P4 specific applications to be able to control a P4 data plane [6].

As of ONOS version 2.1, released in April 2019, the following P4Runtime features were not possible to achieve using the ONOS API: registers, digests and parser value sets [7].

Activity and Documentation

The ONOS project was under active development at the time of this thesis report, and is well documented through its wiki¹⁷. The wiki contains a section specific to using ONOS together with P4 programmable switches¹⁸ which has general information and links to tutorials. This includes a tutorial for running Fabric.p4 with Trellis on BMv2.

6.3 P4 Switch Implementation

The result of the second research question is an implementation of a P4 switch with Ethernet forwarding, MAC learning, and the Rapid Spanning Tree Protocol. This implementation exists in two versions, one running on BMv2 software switches, and the other being a port to run on hardware Stordis switches. Both consist of four parts: P4 code, drivers, a Python controller, and a management CLI.

¹⁴ https://github.com/Azure/SONiC/wiki/Sonic-Roadmap-Planning
¹⁵ https://github.com/opennetworkinglab/onos
¹⁶ https://github.com/opennetworkinglab/onos/blob/master/pipelines/fabric/impl/src/main/resources/fabric.p4
¹⁷ https://wiki.onosproject.org
¹⁸ https://wiki.onosproject.org/display/ONOS/P4+brigade

For practical reasons, the description of the implementation has been moved into a separate chapter. See Chapter 5, P4 Switch Implementation, for this description.

6.3.1 Results From Testing

Other than finding bugs in the Python controller and the CLI during their development, the testing also found issues in the third-party parts of the project.

Issues Found in Switch.p4

Four issues were found in the Switch.p4 project during testing. Out of these, three were patched while one was simply worked around, due to its low impact and the complexity of fixing it properly. A description of these issues and their patches is found in Section 5.4.4.

Temporary Broadcast Storms

In some network situations, both in the tests performed with BMv2 as well as on the Stordis switches, it was found that temporary loops were created in the LAN even though the RSTP implementation was running on each switch.

The situation where this was encountered was when the root switch of the network was “lost”. What is meant by this is that the root switch either was shut off, was given a worse priority, or was isolated from a part of the LAN by links going down. This resulted in the spanning tree becoming unstable for a duration and temporary loops being created. These loops existed for approximately one second, and caused a broadcast storm throughout the network during that time.

This issue was concluded to be a property of the Rapid Spanning Tree Protocol, as opposed to a bug in the implementation. This is supported by studies by Elmeleegy et al. [19] [20], and by Myers et al. [39], which identify it as a count-to-infinity problem of the protocol.

Invalid Ethernet Frames

During testing on the hardware Stordis switches, it was noticed through Wireshark that the BPDU packets being sent out of the switches contained an invalid Ethernet frame check sequence. This issue was not present in packets simply being forwarded by the switch.

However, these BPDU packets were still accepted by other Stordis switches, the Netgear switch, and the Linux hosts. Due to this and the limited time remaining, the cause of the issue was not found during the thesis project.
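For reference, the kind of check that flags such frames can be reproduced in a few lines. The sketch below is an illustration and was not part of the thesis work. It assumes the frame is available as raw bytes including the trailing four FCS bytes, and it relies on the Ethernet FCS being a CRC-32 that zlib.crc32 also computes, stored least significant byte first. The function names and the sample frame contents are hypothetical.

```python
import struct
import zlib


def ethernet_fcs(frame_without_fcs):
    """Return the four FCS bytes for a frame (CRC-32, low byte first)."""
    return struct.pack("<I", zlib.crc32(frame_without_fcs) & 0xFFFFFFFF)


def fcs_is_valid(frame_with_fcs):
    """Recompute the FCS over the frame body and compare with the trailer."""
    if len(frame_with_fcs) <= 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return ethernet_fcs(body) == fcs


# A made-up 60-byte frame addressed to the STP multicast address.
body = (bytes.fromhex("0180c2000000")      # destination MAC
        + bytes.fromhex("020000000001")    # source MAC (arbitrary)
        + struct.pack(">H", 39)            # 802.3 length: LLC + BPDU
        + b"\x42\x42\x03"                  # LLC header used for BPDUs
        + bytes(36)                        # placeholder BPDU payload
        + bytes(7))                        # padding to the minimum size
frame = body + ethernet_fcs(body)
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

print(fcs_is_valid(frame), fcs_is_valid(corrupted))
```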

6.4 Approaches for Migration to P4 and SDN

This section presents approaches to partial migration of a network to P4-programmable switches and the SDN paradigm, needed for answering the third research question. This includes results from the literature review, and knowledge gained from the P4 switch implementation. The following sections will cover the general approaches found, as well as refer to specific architectures implementing them. These architectures are mainly intended for SDN switches using OpenFlow. However, the actions they perform, such as having the controller send and receive packets from the switch’s ports, are also possible using P4 switches. This was shown in the implementation of the P4 switch in this thesis.


6.4.1 Implementing Traditional Protocols

This is a straightforward approach where traditional behavior and protocols are implemented on the switches in P4, and with local control planes. This means that the P4 switches are designed to behave as traditional switches, and implies that no centralized controller is necessary. However, nothing prevents adding one that interacts only with the P4 switches. As for taking advantage of the P4 language, the only limiting factor is that neighboring switches are not guaranteed to be other P4 switches, which reduces the benefits of introducing custom protocols, for example.

By using this approach, no special consideration needs to be taken when replacing a traditional switch with a P4 switch, assuming all wanted protocols have been implemented in P4.

The P4 Switch Implementation

The implementation which was made as part of the second research question uses this approach, see Chapter 5. It does this by implementing MAC learning, and the Rapid Spanning Tree Protocol, following the IEEE 802.1D-2004 [17] standard. This means that it can cooperate with or replace any traditional switch following a compatible standard.
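Interoperability at this level comes down to exchanging BPDUs in the format fixed by the standard. As a rough illustration, the sketch below decodes the 36-byte RST BPDU body with the Python struct module. It is not the controller code of this thesis, which uses Scapy. The field layout and flag bits reflect my reading of the standard and should be checked against it. The names parse_rst_bpdu, RST_BPDU and ROLES, and the sample values, are hypothetical.

```python
import struct

# Assumed RST BPDU layout (big-endian): protocol id, version, type, flags,
# root id, root path cost, bridge id, port id, message age, max age,
# hello time, forward delay, version 1 length. Timers are in 1/256 s.
RST_BPDU = struct.Struct(">HBBB8sI8sHHHHHB")
ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}


def parse_rst_bpdu(data):
    """Decode the BPDU body that follows the LLC header."""
    (proto, version, bpdu_type, flags, root_id, root_cost, bridge_id,
     port_id, msg_age, max_age, hello, fwd_delay, _v1_len) = RST_BPDU.unpack(
        data[:RST_BPDU.size])
    if (proto, version, bpdu_type) != (0, 2, 2):
        raise ValueError("not an RST BPDU")
    return {
        "root_id": root_id.hex(),
        "root_path_cost": root_cost,
        "bridge_id": bridge_id.hex(),
        "port_id": port_id,
        "topology_change": bool(flags & 0x01),
        "proposal": bool(flags & 0x02),
        "role": ROLES[(flags >> 2) & 0x03],
        "learning": bool(flags & 0x10),
        "forwarding": bool(flags & 0x20),
        "agreement": bool(flags & 0x40),
        "message_age": msg_age / 256,
        "max_age": max_age / 256,
        "hello_time": hello / 256,
        "forward_delay": fwd_delay / 256,
    }


bridge = bytes.fromhex("8000020000000001")  # priority 0x8000 plus a made-up MAC
sample = RST_BPDU.pack(0, 2, 2, 0x3C, bridge, 0, bridge, 0x8001,
                       0, 20 * 256, 2 * 256, 15 * 256, 0)
bpdu = parse_rst_bpdu(sample)
print(bpdu["role"], bpdu["forwarding"], bpdu["hello_time"])
```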

6.4.2 Division Into SDN and Traditional Islands

With this approach, the network is divided into different islands, where each island contains either only SDN switches or only traditional switches. This means that the traditional islands will work using distributed protocols, while the SDN islands have a centralized controller.

With this approach, the full benefit of SDN and P4 can be achieved, such as completely custom protocols, although only within each of the SDN islands. The traditional islands are instead left with no benefits whatsoever.

Constructing Spanning Trees Within and Across Islands

Wang et al. [47] (see Section 3.3.1) describe a method for constructing a spanning tree in these types of networks with SDN and traditional islands. This allows keeping the network as a LAN. The other alternative would be to introduce network layer components, such as routers, between the islands.

6.4.3 Abstractions Over the Legacy Switches

This approach involves strategic placement of SDN-capable switches in the network, to allow providing an abstracted view of the network that hides the traditional switches. How this is achieved, and what kind of abstraction is created, differs between the three specific architectures that were found for this approach. However, common to all of them is that they allow for a centralized controller to have partial control over the whole network. It also results in the SDN switches being spread across the network, limiting the use of any custom protocols in the P4 switches.

Panopticon

The Panopticon [35] architecture uses this approach to create an abstraction where the legacy switches are represented by point-to-point links between SDN switches. By strategic placement of the SDN switches, more such links can be represented and more control is achieved. See Section 3.3.2.


Telekinesis

Telekinesis [29] does not create a logical view of the network, but it does provide means for a controller to update paths through the legacy parts of the network. This is achieved by having the SDN switches manipulate MAC learning entries of neighboring legacy switches. See Section 3.3.3.

HybridFlow

HybridFlow [27] is an architecture for providing a view that hides all legacy switches in a hybrid SDN network. This is done by grouping the network into clusters, each containing one SDN switch and multiple legacy ones. In the abstracted view, each cluster is represented as a single SDN switch. The paper presenting HybridFlow does not go into as much detail about its implementation as the papers on Panopticon and Telekinesis do, but it does still present the main ideas behind it. See Section 3.3.4.

6.4.4 SDN and P4 at the Edges

This is an approach where only switches at the edges of the network are replaced with SDN-capable variants. This results in a separation of the edge and the core of the network, where the intelligence is contained in the edge while the core handles the transportation. This way of realizing SDN networks is advocated by Casado et al. [5] in their retrospective of SDN.

With this approach, full control is achieved over the packets that enter and exit the network, especially with the flexibility of P4. However, it completely leaves paths and forwarding behavior inside of the network to the legacy switches.


7 Discussion

This chapter starts by discussing interesting aspects about the results for each of the three research questions. Then the methodology that was used to obtain these results is discussed.

7.1 Open Source P4 Solutions

Something that stood out from these results was that only three complete data plane projects were found: Switch.p4, Sai_bridge.p4 and Fabric.p4. Out of these, only Fabric.p4 was being actively developed and used the newer P4₁₆ version of the language. For each of these three, only one complete open source control plane solution was found. For Fabric.p4 this was ONOS, and for Switch.p4 and Sai_bridge.p4 it was SONiC. Writing a custom controller is of course always a possibility, and this option was chosen for the RSTP implementation in this thesis. This is also the approach used in all the tutorials and examples that were found.

In general, the documentation for several of the projects was found to be slightly lacking. This included the P4 implementations for Switch.p4 and Sai_bridge.p4, which were both undocumented and had minimal source code comments. The usage instructions for these projects also mainly covered running the test cases, and for Switch.p4 it was unclear how it could be connected to the SONiC control plane. Overall, this resulted in more reading of source code during this part of the study than was initially expected.

7.2 P4 Switch Implementation

The P4 switch that was built ended up performing successfully in the testing, both when run on BMv2 and on the hardware Stordis switches. It was also shown to cooperate properly with other third-party switches. The following sections will cover some interesting aspects of the implementation.

7.2.1 Separate Versions for BMv2 and Stordis

It would have been beneficial if the same P4 source code and API were used for both the BMv2 and Stordis switches. However, since a better alternative to the open source Switch.p4 project was available for the Stordis switches, this alternative was used instead. Unfortunately, it was not an open source project and details about it have therefore been left out of this thesis.


One positive aspect found from this was that it was relatively easy to port the RSTP implementation to a different data plane project. This is due to the RSTP algorithm only requiring relatively few interactions with the data plane. The requirements on the data plane only include being able to set ports as blocking or forwarding, enable or disable learning on a port, clear entries from MAC learning, and receive and send BPDUs on specific ports.
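These requirements can be pictured as a narrow interface between the RSTP logic and whichever data plane project sits underneath. The sketch below is an illustration of that idea and does not reproduce the interface used in the thesis. All class and method names are hypothetical, and FakeDataPlane is an in-memory stand-in for a real driver.

```python
from abc import ABC, abstractmethod


class RstpDataPlane(ABC):
    """The few operations the RSTP logic needs from a data plane."""

    @abstractmethod
    def set_port_forwarding(self, port, forwarding):
        """Set a port as forwarding (True) or blocking (False)."""

    @abstractmethod
    def set_port_learning(self, port, enabled):
        """Enable or disable MAC learning on a port."""

    @abstractmethod
    def flush_mac_entries(self, port):
        """Clear learned MAC entries for a port."""

    @abstractmethod
    def send_bpdu(self, port, payload):
        """Send a BPDU out of a specific port."""

    @abstractmethod
    def receive_bpdu(self):
        """Return (port, payload) for a received BPDU, or None."""


class FakeDataPlane(RstpDataPlane):
    """In-memory implementation, useful only for exercising the logic."""

    def __init__(self):
        self.forwarding, self.learning = {}, {}
        self.flushed, self.sent, self.inbox = [], [], []

    def set_port_forwarding(self, port, forwarding):
        self.forwarding[port] = forwarding

    def set_port_learning(self, port, enabled):
        self.learning[port] = enabled

    def flush_mac_entries(self, port):
        self.flushed.append(port)

    def send_bpdu(self, port, payload):
        self.sent.append((port, payload))

    def receive_bpdu(self):
        return self.inbox.pop(0) if self.inbox else None


def block_port(data_plane, port):
    """Example of RSTP-side code written only against the interface."""
    data_plane.set_port_forwarding(port, False)
    data_plane.set_port_learning(port, False)
    data_plane.flush_mac_entries(port)


dp = FakeDataPlane()
block_port(dp, 3)
print(dp.forwarding, dp.learning, dp.flushed)
```

Porting to another data plane project then amounts to writing one more subclass, which matches the experience described above.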

7.2.2 Consequences of Using Python

As described in the method chapter, there were two main choices for programming language when implementing the controller. These were C++ and Python, and Python was chosen.

A direct consequence of this choice is that the functions of SwitchAPI that assist with reading and sending packets on the CPU port are not used, since they are not part of the Thrift API. Instead, the Python library Scapy is used for this purpose. One potential concern with this is that the overhead of Python and Scapy, which is documented to not focus on performance¹, could be significant enough to cause issues under high load. However, since RSTP is driven by timers with intervals of whole seconds, this is not considered a problem during normal operation of the protocol.

In terms of development and the robustness of the implementation, there were both positives and negatives with Python. One problem experienced was errors in some of the rarer states of the RSTP algorithm. Since Python is interpreted, these errors were not found until the state in question was entered, and therefore they could remain unnoticed for a long time. On the other hand, a very positive aspect of Python is that attempting to access an undefined property of a class gives an error. This allows convenient detection of when an RSTP state tries to access undefined state variables. In a naive C++ implementation, this would simply read an arbitrary value.
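The behavior referred to here is easy to demonstrate. In the minimal sketch below, the class PortVars and its attribute names are made up for illustration and are not the state variables of the thesis implementation.

```python
class PortVars:
    """A few per-port state variables; 'forward' is deliberately missing."""

    def __init__(self):
        self.role = "disabled"
        self.learning = False


port = PortVars()
try:
    # Reading a variable that was never initialised raises AttributeError
    # at the moment the offending code path is executed.
    if port.forward:
        pass
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

The error only appears once this line actually runs, which is the same property that let mistakes in rarely entered states go unnoticed.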

7.2.3 Temporary Packet Storms

The temporary packet storms that were encountered during testing were identified to be due to a count-to-infinity problem in the RSTP standard. While these problems were rare and of a short duration, there exist suggestions for mitigating this issue. One such suggestion is by Muhammad in his proposal of DRSTP [2], which attempts to solve the count-to-infinity problem while retaining compatibility with standard RSTP and STP.

7.3 Approaches for Migration to P4 and SDN

The results for this research question suggest that benefits of P4 and SDN can be achieved in a partially migrated network.

In the SDN case, this was clear in the approach with SDN and non-SDN islands, and in the Panopticon architecture for abstracting the network into a logical one. For the islands, the full benefits of an SDN controller can be achieved within each SDN island. Panopticon instead provides control over the whole network, but through a view that hides many of the switches. Similar to Panopticon is the HybridFlow architecture, which has a different method for creating the logical network view. However, the paper presenting it (see Section 3.3.4) does not go into as much depth as the Panopticon paper, and it is therefore somewhat unclear exactly how it would be implemented, and how the two compare.

As for Telekinesis, it only focuses on path enforcement through a hybrid network. While the authors did show that this can be successfully achieved by manipulation of legacy switches, this also came with issues such as unstable paths, which they referred to as path flapping. Telekinesis is also not transparent to the control plane, like Panopticon and HybridFlow are with their logical views, and therefore requires special care in the controller. Due to this and the complexity introduced, it is debatable if the increased path control is worth it in most cases.

¹ https://scapy.readthedocs.io/en/latest/usage.html

7.4 Method

This section comments on, and discusses limitations of, the methodology used in conducting this thesis project.

7.4.1 Literature Review

One limitation in the literature review was that only one academic database, Google Scholar, was used. While Google Scholar has been estimated by researchers to be the largest academic database [25], there is a risk that some literature was missed due to this limitation. This could include related work to this thesis, as well as approaches for migration to P4 and SDN for the third research question.

7.4.2 Finding Open Source Solutions

When looking for open source P4 projects, most effort was spent searching on GitHub, which is a popular service for hosting Git projects. It is also where the P4 Language Consortium² keeps its projects, meaning that any forks of these would likely be on GitHub as well. However, effort could also have been put into finding and searching other similar services. This could potentially have led to more projects being found, but it was deemed highly likely that the majority of such projects would be on GitHub.

7.4.3 Evaluating Open Source Solutions

A significant part of the evaluation of functionality for the projects ended up requiring reading through source code. This was due to a lack of documentation for some of the projects. When drawing conclusions in this way, there is always a risk of misunderstanding if or how a feature is implemented in the project, and it also takes a lot of time. In hindsight, it could have been beneficial to try to contact previous contributors to these projects to ask questions.

7.4.4 P4 Switch Implementation

One interesting decision taken during the implementation part of this project was to start with STP instead of going directly for the end goal of RSTP. At the time, this seemed like a good idea, and the motivation behind it is described under the method chapter in Section 4.4.3. However, this did end up taking a fair bit of time, and the STP part of the project ended up being abandoned before it was considered complete. This was because enough experience had been obtained to start with the RSTP implementation at that point.

The differences between STP and RSTP ended up being large enough that a big chunk of the code written had to be thrown away, even though RSTP has backwards compatibility with STP. Because of this, it is debatable whether this was a good decision. In hindsight, it is likely that time would have been saved if RSTP had been implemented directly.

However, with the knowledge available at the time of this decision, it was definitely the safer option due to STP being a simpler standard. This is especially true considering the fact that it was not known how well the Switch.p4 project worked at this point. Debugging both the more complex RSTP algorithm and Switch.p4 at once could have been problematic.

² https://p4.org/


7.4.5 Testing of the Implementation

The testing of the P4 switch worked well thanks to the virtual test environment that was created. This allowed very fast testing with different network topologies on a single computer. However, the resources of a single computer are limited, and because of this it was not viable to run that many instances of BMv2 at once. The topologies that were used therefore had a maximum of six switches, which is a fairly small network. It probably would have been viable to run bigger networks than this, but this would have led to issues in properly drawing the network in the visualization program as well. Either way, there is a possibility that a larger network could reveal issues that did not show up in the smaller ones.

Another aspect of the testing worth mentioning was that it was fully manual and that it was a form of black-box testing where the verification mainly involved comparing the resulting spanning tree with the expected one. There were also no exact test cases defined, but a list of interesting edge cases of the protocol was noted down to ensure they were included in the tests. Still, a lot of actions taken during the tests were arbitrary.

Having automatic tests would have been very useful, so one option that was considered was to use the same framework as Switch.p4, which is the Packet Test Framework³. This could have been used to perform automatic unit tests on a single switch. However, because of the time these would have taken to write, and since it was more interesting to perform tests involving multiple switches, this option was not investigated further. Automatic tests with multiple switches were also dismissed as too complex and time consuming to perform during this thesis project.
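The comparison that was carried out manually, between the resulting spanning tree and the expected one, could be automated along the following lines. This is a simplified sketch and not part of the thesis implementation: the function names and the topology format are hypothetical, each link is assumed to have a single path cost, and ties are broken on path cost, designated bridge identifier, designated port and receiving port, in that order. How the observed root ports would be collected from the running switches is left open.

```python
from typing import Dict, List, Optional, Tuple

BridgeId = Tuple[int, str]           # (priority, MAC address)
Link = Tuple[str, int, str, int, int]  # (bridge A, port A, bridge B, port B, cost)


def expected_root_ports(bridge_ids: Dict[str, BridgeId],
                        links: List[Link]) -> Dict[str, Optional[int]]:
    """Return the expected root port per bridge (None for the root bridge)."""
    root = min(bridge_ids, key=lambda name: bridge_ids[name])
    cost = {name: float("inf") for name in bridge_ids}
    cost[root] = 0
    root_port: Dict[str, Optional[int]] = {name: None for name in bridge_ids}
    best: Dict[str, tuple] = {}  # bridge -> best candidate seen so far

    # Relax links until nothing changes; costs only ever decrease.
    changed = True
    while changed:
        changed = False
        for a, pa, b, pb, link_cost in links:
            # Every link can be traversed in both directions.
            for me, my_port, peer, peer_port in ((a, pa, b, pb), (b, pb, a, pa)):
                if me == root or cost[peer] == float("inf"):
                    continue
                candidate = (cost[peer] + link_cost, bridge_ids[peer],
                             peer_port, my_port)
                if me not in best or candidate < best[me]:
                    best[me] = candidate
                    cost[me] = candidate[0]
                    root_port[me] = my_port
                    changed = True
    return root_port


def mismatches(expected: Dict[str, Optional[int]],
               observed: Dict[str, Optional[int]]) -> List[str]:
    """Names of bridges whose observed root port differs from the expected one."""
    return sorted(n for n in expected if observed.get(n) != expected[n])


# Four switches in a ring; S4 has two equal-cost paths to the root S1.
ids = {f"S{i}": (32768, f"00:00:00:00:00:0{i}") for i in range(1, 5)}
ring = [("S1", 1, "S2", 1, 20000), ("S1", 2, "S3", 1, 20000),
        ("S2", 2, "S4", 1, 20000), ("S3", 2, "S4", 2, 20000)]
expected = expected_root_ports(ids, ring)
print(expected)
```

In this example S4 is expected to choose its port towards S2, since both paths have the same cost and S2 has the lower bridge identifier. A test would pass when `mismatches(expected, observed)` returns an empty list after the network has converged.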

7.4.6 Study of Migration Approaches

The biggest limitation in this study was the time allocated for it. The main focus of this thesis was on the P4 open source solutions and especially the P4 switch implementation. Therefore, this study was focused on providing an overview of the different approaches and architectures. If more time had been available, a more in-depth study could have been performed with a comparison of the different approaches.

It would also have been interesting to relate these approaches to the SDN controller ONOS, which is compatible with P4 switches. With some more experimentation, specific implementation details about how these approaches can be realized with P4 and ONOS could have been included in the results.

7.5 The Work in a Wider Context

This thesis project makes contributions to the open source P4 community, and presents options intended to aid migration from traditional switches to P4-programmable switches.

Use of the P4 language and white box switches results in more open and transparent networking as a whole. It avoids vendor lock-in since the language is standardized and aims to be portable across different devices. Addition of new features no longer requires depending on the vendor, or wastefully buying a new switch altogether. Instead, this is just a matter of changing the source code and loading the new P4 program onto the switch.

The flexibility of the language promotes new innovative solutions and specialized use cases. These solutions can then, if desired, be shared with the community. This can also lead to reduced duplicated efforts, as opposed to the traditional switches where each vendor re-implements the same protocols.

³ https://github.com/p4lang/ptf


8 Conclusions

The aim of this thesis project was to look at different aspects of migrating networks to P4-programmable switches. This has been done by investigating available open source solutions and by using one of these to build a P4 switch providing redundancy and loop prevention through the Rapid Spanning Tree Protocol. This implementation provides a base that can be extended, and was published as an open source project. In addition to this, different approaches were studied for performing this migration without requiring replacement of every switch in the network.

8.1 Answers to the Research Questions

In the following three sections, the research questions defined in Section 1.3 are answered in order based on the obtained results.

8.1.1 Open Source P4 Solutions

The relevant link layer functionality for an Ethernet switch can be found under Section 6.1 in the Result chapter.

For the data plane of a P4 switch, all of these link layer features are possible to replicate with the Switch.p4 (Section 6.2.2) and Sai_bridge.p4 (Section 6.2.3) projects. However, the control plane compatible with these, SONiC (Section 6.2.4), is missing support for any spanning tree protocol.

The SDN controller ONOS and the Fabric.p4 data plane (Section 6.2.5) provide all of the link layer features except for link aggregation. Since this solution is based on the SDN paradigm, its method for providing redundancy and loop prevention does not use any spanning tree protocol, and is therefore not directly compatible with legacy devices.

8.1.2 P4 Switch Implementation

The data and control planes for a P4-programmable switch can be implemented to perform MAC learning and the Rapid Spanning Tree Protocol (RSTP). This is shown by using the Switch.p4 project as data plane, and by having a local control plane in each switch where the RSTP logic is implemented. Chapter 5 provides a description of this implementation, and its source code can be found on GitHub¹.

8.1.3 Approaches for Migration to P4 and SDN

At least four general approaches exist for partially migrating a traditional network to use P4 or SDN. These, and specific architectures of them, are described in Section 6.4. The general approaches are the following:

1. Implementing traditional protocols.

2. Division into SDN and traditional islands.

3. Abstractions over the legacy switches.

4. SDN and P4 at the edges.

All of these are found to benefit from the flexibility of the P4 language, but the usefulness of defining custom protocols, or anything that involves interactions between switches, becomes limited. However, in the approach involving SDN islands, full advantage can be achieved within each island.

The specific implementations of the second and third approaches, which are also found under Section 6.4, demonstrate that a centralized SDN controller can provide benefits to a hybrid network.

8.2 Future Work

Several link layer features are missing from the P4 switch implementation, such as link aggregation, VLANs and discovery mechanisms. To allow implementing VLANs with support for trunk ports, the Rapid Spanning Tree Protocol algorithm would need to be upgraded to the Multiple Spanning Tree Protocol, which is designed to handle these. The P4 switch that was built also does not take advantage of the flexibility of the P4 language. Interesting possibilities here include creating custom protocols, adding metadata to packets and in-band network telemetry².

A possible continuation of the study concerning the third research question could be to investigate these approaches further, and see if they can be viably implemented with P4 and the ONOS SDN controller.

¹ https://github.com/henli070/p4-rstp-switch
² https://p4.org/p4/inband-network-telemetry/


Bibliography

[1] Rashid Amin, Martin Reisslein, and Nadir Shah. “Hybrid SDN networks: A survey of existing approaches”. In: IEEE Communications Surveys & Tutorials 20.4 (2018), pp. 3259–3306.

[2] Syed Muhammad Atif. “DRSTP: A Simple Technique for Preventing Count-to-Infinity in RSTP Controlled Switched Ethernet Networks”. In: International Journal of Computer Networks 2.6 (2011), pp. 278–296.

[3] Pat Bosshart, Dan Daly, Glen Gibb, Martin Izzard, Nick McKeown, Jennifer Rexford, Cole Schlesinger, Dan Talayco, Amin Vahdat, George Varghese, et al. “P4: Programming protocol-independent packet processors”. In: ACM SIGCOMM Computer Communication Review 44.3 (2014), pp. 87–95.

[4] Mihai Budiu and Chris Dodd. “The P4₁₆ programming language”. In: ACM SIGOPS Operating Systems Review 51.1 (2017), pp. 5–14.

[5] Martin Casado, Teemu Koponen, Scott Shenker, and Amin Tootoonchian. “Fabric: a retrospective on evolving SDN”. In: Proceedings of the first workshop on Hot topics in software defined networks. 2012, pp. 85–90.

[6] Carmelo Cascone. ONOS Support for P4. https://www.opennetworking.org/wp-content/uploads/2018/12/ONOS-support-for-P4.pdf Accessed: 2020-03-06.

[7] Carmelo Cascone. Performance evaluation of ONOS support for P4Runtime. https://wiki.onosproject.org/download/attachments/12422167/ONOS%2BP4%20SecPerf%20Workshop%20@%20TMA%202019.pdf Accessed: 2020-03-13.

[8] The P4 Language Consortium. P4₁₆ Language Specification. https://p4.org/p4-spec/docs/P4-16-v1.2.0.pdf Accessed: 2020-02-05.

[9] The P4 Language Consortium. The BMv2 Simple Switch target. https://github.com/p4lang/behavioral-model/blob/master/docs/simple_switch.md Accessed: 2020-02-25.

[10] The P4 Language Consortium. The P4 Language Specification. https://p4.org/p4-spec/p4-14/v1.0.5/tex/p4.pdf Accessed: 2020-05-06.

[11] Rogério Leão Santos De Oliveira, Christiane Marie Schweitzer, Ailton Akira Shinoda, and Ligia Rodrigues Prete. “Using mininet for emulation and prototyping software-defined networks”. In: 2014 IEEE Colombian Conference on Communications and Computing (COLCOM). IEEE. 2014, pp. 1–6.


[12] Gary A. Donahue. Network warrior. O'Reilly, 2011.

[13] Institute of Electrical and Electronics Engineers. IEEE Standard for Ethernet. IEEE Standard 802.3-2018. 2018.

[14] Institute of Electrical and Electronics Engineers. IEEE Standard for Local and Metropolitan Area Networks: Bridges and Bridged Networks. IEEE Standard 802.1Q-2018. 2018.

[15] Institute of Electrical and Electronics Engineers. IEEE Standard for Local and Metropolitan Area Networks: Link Aggregation. IEEE Standard 802.1AX-2014. 2014.

[16] Institute of Electrical and Electronics Engineers. IEEE Standard for Local and Metropolitan Area Networks: Station and Media Access Control Connectivity Discovery. IEEE Standard 802.1AB-2016. 2016.

[17] Institute of Electrical and Electronics Engineers. IEEE Standard for Local and Metropolitan Area Networks: Media Access Control (MAC) Bridges. IEEE Standard 802.1D-2004. 2004.

[18] Institute of Electrical and Electronics Engineers. Media Access Control (MAC) Bridges. IEEE Standard 802.1D-1998. 1998.

[19] Khaled Elmeleegy, Alan L Cox, and TS Eugene Ng. “On count-to-infinity induced forwarding loops in Ethernet networks”. In: Proceedings IEEE INFOCOM 2006. 25th IEEE International Conference on Computer Communications. IEEE. 2006, pp. 1–13.

[20] Khaled Elmeleegy, Alan L Cox, and TS Eugene Ng. “Understanding and mitigating the effects of count to infinity in Ethernet networks”. In: IEEE/ACM Transactions on Networking 17.1 (2009), pp. 186–199.

[21] Nick Feamster, Jennifer Rexford, and Ellen Zegura. “The road to SDN: an intellectual history of programmable networks”. In: ACM SIGCOMM Computer Communication Review 44.2 (2014), pp. 87–98.

[22] Open Networking Foundation. Trellis Platform Brief. https://www.opennetworking.org/wp-content/uploads/2019/09/TrellisPlatformBrief.pdf Accessed: 2020-03-13.

[23] The P4.org API Working Group. P4Runtime Specification. https://p4.org/p4runtime/spec/v1.0.0/P4Runtime-Spec.pdf Accessed: 2020-02-05.

[24] The P4.org Architecture Working Group. P4₁₆ Portable Switch Architecture (PSA). https://p4.org/p4-spec/docs/PSA-v1.1.0.pdf Accessed: 2020-03-13.

[25] Michael Gusenbauer. “Google Scholar to overshadow them all? Comparing the sizes of 12 academic search engines and bibliographic databases”. In: Scientometrics 118.1 (2019), pp. 177–214.

[26] Bruce Hartpence. Packet guide to routing and switching. O'Reilly, 2011.

[27] Siyuan Huang, Jin Zhao, and Xin Wang. “HybridFlow: A lightweight control plane for hybrid SDN in enterprise networks”. In: 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS). IEEE. 2016, pp. 1–2.

[28] Xinli Huang, Shang Cheng, Kun Cao, Peijin Cong, Tongquan Wei, and Shiyan Hu. “A survey of deployment solutions and optimization strategies for hybrid SDN networks”. In: IEEE Communications Surveys & Tutorials 21.2 (2018), pp. 1483–1507.

[29] Cheng Jin, Cristian Lumezanu, Qiang Xu, Zhi-Li Zhang, and Guofei Jiang. “Telekinesis: Controlling legacy switch routing with openflow in hybrid networks”. In: Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research. 2015, pp. 1–7.

[30] Matty Kadosh. SONIC Extension Infrastructure. https://github.com/Azure/SONiC/blob/master/doc/ocp/201903-SONIC/workshop/SONiCExtensionInfrastructure-MLNX.pdf Accessed: 2020-03-13.


[31] Barbara Kitchenham and Stuart Charters. “Guidelines for performing systematic literature reviews in software engineering”. In: (2007).

[32] Diego Kreutz, Fernando MV Ramos, Paulo Esteves Verissimo, Christian Esteve Rothenberg, Siamak Azodolmolky, and Steve Uhlig. “Software-defined networking: A comprehensive survey”. In: Proceedings of the IEEE 103.1 (2014), pp. 14–76.

[33] James F. Kurose and Keith W. Ross. Computer networking: a top-down approach. Pearson, 2013.

[34] Bob Lantz, Brandon Heller, and Nick McKeown. “A network in a laptop: rapid prototyping for software-defined networks”. In: Proceedings of the 9th ACM SIGCOMM Workshop on Hot Topics in Networks. 2010, pp. 1–6.

[35] Dan Levin, Marco Canini, Stefan Schmid, Fabian Schaffert, and Anja Feldmann. “Panopticon: Reaping the Benefits of Incremental SDN Deployment in Enterprise Networks”. In: 2014 USENIX Annual Technical Conference (USENIX ATC 14). 2014, pp. 333–345.

[36] Mark Slee, Aditya Agarwal, and Marc Kwiatkowski. Thrift: Scalable Cross-Language Services Implementation. https://thrift.apache.org/static/files/thrift-20070401.pdf Accessed: 2020-05-05.

[37] Isaias Martinez-Yelmo, Joaquin Alvarez-Horcajo, Miguel Briso-Montiano, Diego Lopez-Pajares, and Elisa Rojas. “ARP-P4: A Hybrid ARP-Path/P4Runtime Switch”. In: 2018 IEEE 26th International Conference on Network Protocols (ICNP). IEEE. 2018, pp. 438–439.

[38] Isaias Martinez-Yelmo, Joaquin Alvarez-Horcajo, Miguel Briso-Montiano, Diego Lopez-Pajares, and Elisa Rojas. “ARP-P4: deep analysis of a hybrid SDN ARP-Path/P4Runtime switch”. In: Telecommunication Systems 72.4 (2019), pp. 555–565.

[39] Andy Myers, Eugene Ng, and Hui Zhang. “Rethinking the service model: Scaling Ethernet to a million nodes”. In: Proc. HotNets. Citeseer. 2004.

[40] NETGEAR. ProSAFE® 8-port Gigabit PoE Smart Managed Switch with 2 Gigabit Fiber SFP. www.downloads.netgear.com/files/GDC/datasheet/en/GS110TP.pdf Accessed: 2020-05-07.

[41] Sterling Perrin and Heavy Reading. “Bringing Disaggregation to Transport Networks”. In: Heavy Reading/Fujitsu White Paper (2015).

[42] Rich Seifert and Jim Edwards. The all-new switch book: the complete guide to LAN switching technology. Wiley, 2008.

[43] Seungwon Shin, Yongjoo Song, Taekyung Lee, Sangho Lee, Jaewoong Chung, Phillip Porras, Vinod Yegneswaran, Jiseong Noh, and Brent Byunghoon Kang. “Rosemary: A robust, secure, and high-performance network operating system”. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. ACM. 2014, pp. 78–89.

[44] Yash Sinha, K Haribabu, et al. “A survey: hybrid SDN”. In: Journal of Network and Computer Applications 100 (2017), pp. 35–55.

[45] Anirudh Sivaraman, Changhoon Kim, Ramkumar Krishnamoorthy, Advait Dixit, and Mihai Budiu. “DC.p4: Programming the forwarding plane of a data-center switch”. In: Proceedings of the 1st ACM SIGCOMM Symposium on Software Defined Networking Research. ACM. 2015, p. 2.

[46] STORDIS. BF2556X-1T Advanced Programmable Switch. https://www.stordis.com/wp-content/uploads/2019/12/STORDIS_BF2556X-1T-A1F.pdf Accessed: 2020-05-07.


[47] Shie-Yuan Wang, Chia-Cheng Wu, and Chih-Liang Chou. “Constructing an optimal spanning tree over a hybrid network with SDN and legacy switches”. In: 2015 IEEE symposium on computers and communication (ISCC). IEEE. 2015, pp. 502–507.


A Installation of Development Environment

This appendix documents the steps that were taken when setting up the development environment that was used for the research of this thesis project.

The operating system used was Ubuntu 16.04¹, which was installed and run in a virtual machine using VirtualBox², with 30 GB of storage. The tools that were installed on the machine were PI, BMv2, p4c, Mininet, p4c-bm and PTF. Since there are multiple ways to configure these projects during installation, the exact commands that were used will be described in the following sections.

Note that if the only goal is to run the P4 switch implementation of this thesis, the README.md file of the project should be consulted instead.

A.1 Installing Dependencies

Many of the tools that were to be installed come from the P4 Language Consortium’s GitHub organization³, and in several cases these tools share dependencies. Therefore, all these dependencies were installed at the same time at the start.

A.1.1 Ubuntu and Pip Packages

All required packages for the projects were collected into one list, and then they were all installed as Listing A.1 shows.

Listing A.1: Installation of various packages.

    sudo apt update
    sudo apt install git automake cmake libjudy-dev libgmp-dev libpcap-dev \
        libboost-dev libboost-test-dev libboost-program-options-dev \
        libboost-system-dev libboost-filesystem-dev libboost-thread-dev \
        libevent-dev libtool flex bison pkg-config g++ libssl-dev libgc-dev \
        libfl-dev libboost-iostreams-dev libboost-graph-dev llvm python tcpdump \
        autoconf curl make unzip libtool-bin python-pip
    pip install --upgrade pip
    pip install scapy psutil ply ipaddr grpcio

¹ http://releases.ubuntu.com/16.04/
² https://www.virtualbox.org/
³ https://github.com/p4lang/

A.1.2 Apache Thrift

Apache Thrift⁴ is a dependency of BMv2, and was installed as shown in Listing A.2.

Listing A.2: Installation of Thrift.

    git clone https://github.com/apache/thrift.git
    cd thrift
    git checkout 0.9.2
    ./bootstrap.sh
    ./configure --with-cpp=yes --with-c_glib=no --with-java=no --with-ruby=no \
        --with-erlang=no --with-go=no --with-nodejs=no
    make
    sudo make install
    cd lib/py
    sudo python setup.py install
    cd ../../..

A.1.3 Nanomsg

The Nanomsg project⁵ is needed by both PI and BMv2, and Listing A.3 shows the installation process.

Listing A.3: Installation of Nanomsg.

    wget https://github.com/nanomsg/nanomsg/archive/1.0.0.tar.gz -O nanomsg-1.0.0.tar.gz
    tar -xzvf nanomsg-1.0.0.tar.gz
    cd nanomsg-1.0.0
    mkdir build && cd build
    cmake ..
    cmake --build .
    sudo cmake --build . --target install
    sudo ldconfig
    pip install nnpy
    cd ../..

A.1.4 Protobuf

Protobuf⁶ is a project by Google and is a dependency of PI and p4c. It was installed as shown in Listing A.4.

Listing A.4: Installation of Protobuf.

    git clone https://github.com/protocolbuffers/protobuf.git
    cd protobuf
    git checkout v3.6.1
    git submodule update --init --recursive
    ./autogen.sh
    ./configure
    make
    sudo make install
    cd python
    sudo python setup.py install
    sudo ldconfig
    cd ../..

⁴ https://thrift.apache.org/
⁵ https://nanomsg.org/
⁶ https://github.com/protocolbuffers/protobuf

A.1.5 gRPC

The gRPC project is required by PI and was installed as Listing A.5 shows.

Listing A.5: Installation of gRPC.

    git clone https://github.com/google/grpc.git
    cd grpc
    git checkout v1.17.2
    git submodule update --init --recursive
    make
    sudo make install
    sudo ldconfig
    cd ..

A.2 Installing PI

The PI project⁷ is an open source P4Runtime implementation and it was installed as shown in Listing A.6.

Listing A.6: Installation of PI.

    git clone https://github.com/p4lang/PI.git
    cd PI
    git submodule update --init --recursive
    ./autogen.sh
    ./configure --with-proto
    make
    sudo make install
    sudo ldconfig
    cd ..

A.3 Installing Behavioral Model V2 (BMv2)

BMv2⁸ is an open source software switch capable of running P4 programs, and is intended for development and testing. It was installed as shown in Listing A.7.

Listing A.7: Installation of BMv2.

    git clone https://github.com/p4lang/behavioral-model.git
    cd behavioral-model
    ./autogen.sh
    ./configure --enable-debugger --with-pi --with-pdfixed
    make
    sudo make install
    sudo ldconfig
    cd targets/simple_switch_grpc
    ./autogen.sh
    ./configure --with-thrift
    make
    sudo make install
    sudo ldconfig
    cd ../../..

⁷ https://github.com/p4lang/PI
⁸ https://github.com/p4lang/behavioral-model/

A.4 Installing the P4 Compiler (p4c)

The p4c project⁹ is an open source compiler for both P4₁₆ and P4₁₄. It includes multiple backends, including one which targets BMv2. Listing A.8 shows the installation procedure that was used.

Listing A.8: Installation of p4c.

    git clone --recursive https://github.com/p4lang/p4c.git
    cd p4c
    mkdir build
    cd build
    cmake ..
    make
    sudo make install
    cd ../..

A.5 Installing the Older P4 Compiler (p4c-bm)

The predecessor of the p4c compiler is called p4c-bm¹⁰ and it is able to compile P4₁₄ code for the BMv2 target. It was installed to allow building some of the older open source P4 projects, and the installation steps that were used are shown in Listing A.9.

Listing A.9: Installation of p4c-bm.

    git clone https://github.com/p4lang/p4c-bm.git
    cd p4c-bm
    sudo pip install -r requirements.txt
    sudo pip install -r requirements_v1_1.txt
    sudo python setup.py install
    cd ..

A.6 Installing Mininet

Mininet¹¹ is an emulator for SDN networks and was used in testing both data and control planes. Listing A.10 shows how it was installed.

Listing A.10: Installation of Mininet.

    git clone https://github.com/mininet/mininet.git
    sudo ./mininet/util/install.sh -nwv

A.7 Installing Packet Test Framework (PTF)

The PTF project¹² is a testing framework for data planes, and was installed as described in Listing A.11.

⁹ https://github.com/p4lang/p4c
¹⁰ https://github.com/p4lang/p4c-bm
¹¹ https://github.com/mininet/mininet
¹² https://github.com/p4lang/ptf


Listing A.11: Installation of PTF.

    git clone https://github.com/p4lang/ptf.git
    cd ptf
    sudo python setup.py install
    cd ..
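After completing the steps in this appendix, a quick check that the main executables ended up on the search path can save debugging time later. The following sketch is not part of the original setup. It assumes Python 3 for `shutil.which`, and the executable names in `EXPECTED_TOOLS` are assumptions about what the installations above provide, so the list may need adjusting.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Executable names assumed to be provided by the installations above.
EXPECTED_TOOLS = ["simple_switch", "simple_switch_grpc", "p4c", "mn", "ptf"]


def missing_tools(names: Iterable[str],
                  which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return the names that cannot be found on the search path."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(EXPECTED_TOOLS)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All tools found.")
```

The lookup function is passed as a parameter so that the check itself can be tested without any of the tools being installed.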
