Embedded Network Systems Internet2 Technology Exchange 2018
October, 2018
Eric Boyd, Ed Colone
perfSONAR Embedded Systems
1. Background
2. Standards, Principles, Requirements
3. Emerging Technology
a. Vision and End Goal
b. Challenges
4. Manufacturing Collaboration
5. Advice for Next Generation Appliances
Background: Emerging Technology
● perfSONAR: over 2,000 deployments worldwide (mostly Dell servers)
● Containers emerging as an offering on traditional network devices
● Vision: perfSONAR embedded in major vendor hardware
Background: Problem of Scale
The ability to benchmark and troubleshoot the “last mile” of large distributed networks.
U-M current scope:
● Number of AL switches: 101 models, 3,825 devices
● Number of DL switches: 22 models, 239 devices
Background: Sample Soft Failure: Failing Optics
[Figure: throughput (Gb/s) over time, showing normal performance, then degrading performance over one month, then repair. Credit: Jason Zurawski, ESnet]
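A soft failure like the one in the figure shows up as a sustained decline rather than an outage. The sketch below is one simple way to flag that pattern in a series of throughput measurements. The function name, the thresholds, and the sample series are all illustrative assumptions; the series is made up and is not data from the slide.

```python
from statistics import median


def first_sustained_drop(samples_gbps, baseline_n=5, drop_fraction=0.2, run_length=3):
    """Return the index where throughput first stays below
    (1 - drop_fraction) * baseline for run_length consecutive samples,
    or None if that never happens.

    The baseline is the median of the first baseline_n samples.
    All parameters are hypothetical defaults chosen for illustration.
    """
    if len(samples_gbps) < baseline_n:
        raise ValueError("need at least baseline_n samples")
    baseline = median(samples_gbps[:baseline_n])
    floor = (1.0 - drop_fraction) * baseline
    run = 0
    for i in range(baseline_n, len(samples_gbps)):
        # Count consecutive samples under the floor; reset on recovery.
        run = run + 1 if samples_gbps[i] < floor else 0
        if run == run_length:
            return i - run_length + 1
    return None


# Made-up daily measurements: steady, slowly degrading, then repaired.
series = [9.4, 9.3, 9.4, 9.5, 9.4, 9.2, 8.8, 8.1, 7.0, 6.2, 5.1, 9.4]
print(first_sustained_drop(series))
```

Requiring several consecutive low samples is a deliberate choice here: a single bad test result is common on a busy network, while a failing optic degrades over days or weeks.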
Background: Challenge
Legacy location issues:
● Space
● Power
● Climate control
On-site support challenges:
● Too many locations (>1,000)
● Impossible to provide 24x7 operations
● Servers with management planes more expensive
Emerging Technology Vision
● Scalable
○ Deploy
○ Support
○ Grow
● Lower costs / fewer machines
● Fewer dependencies
● Becomes a part of the in-building network replacement lifecycle
Provision perfSONAR on the network gear itself.
Standards, Principles, Requirements
Embedded solution vs. a stand-alone perfSONAR node:
● Don’t gouge: the cost of the embedded solution must not be greater than that of a discrete solution.
● Don’t suck: network hardware must perform its switching and routing duties unimpeded.
● Don’t lie: performance measurements must be as accurate as those from discrete servers.
Embedded Platform Options
● “Bare Metal” port (Yocto / Wind River)
○ High effort - Embedded Linux not a dev environment
○ Unsupportable
● KVM
○ Poor performance
○ Larger containers
● Docker, LXC
○ Better performance
○ Standards for creation
○ Some existing support (perfSONAR on Docker)
● Proprietary container process
○ Start with base image/container (Docker, LXC, etc.)
○ More complex provisioning process
○ Security concerns may require perfSONAR redesign / development
Emerging Technology Architecture
Current throughput constraints:
● Desired performance
○ Max data port rate (10 / 40 / 100 Gbps, etc.)
● Management port constraints
○ ~1 Gbps (often limited to copper interfaces)
● Bus performance
○ Currently < 1 Gbps
[Diagram: switch architecture. Control plane (routing engine, OS) with management port (~1 Gbps); data plane (PFE) with data ports (1 / 10 / 40 / 100 Gbps); bus between the planes (< 1 Gbps).]
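The constraint described above is a bottleneck argument: a measurement can run no faster than the slowest segment it crosses. The sketch below illustrates that with the approximate rates from the slide. The path layout and names are assumptions for illustration, and the bus is entered as 1.0 Gbps, which is an upper bound since the slide only says "< 1".

```python
def bottleneck_gbps(segments):
    """Given a dict of segment name -> rate in Gbps, return the
    (name, rate) of the slowest segment on the path."""
    name = min(segments, key=segments.get)
    return name, segments[name]


# Hypothetical paths a container's test traffic might take.
# Rates are the approximate figures from the slide; the bus value
# is an upper bound ("< 1 Gbps").
paths = {
    "management port": {"management port": 1.0},
    "10G data port via bus": {"data port": 10.0, "bus": 1.0},
    "100G data port via bus": {"data port": 100.0, "bus": 1.0},
}

for label, segments in paths.items():
    segment, rate = bottleneck_gbps(segments)
    print(f"{label}: limited by {segment} at <= {rate} Gbps")
```

Under these assumptions a faster data port changes nothing: every path that crosses the bus is capped by the bus.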
Container Networking: Management Port
● Currently has most throughput performance (~1 GbE)
● If management port limited to copper, could be incompatible with the rest of the switch, bad for “leaf nodes” (can’t plug a 1G copper port into a QSFP port)
● Least desirable from Net Admin perspective
○ Precludes Out-Of-Band Management
● Cable required
○ Outage risk from possible misidentification as a network loop
○ Adds installation complexity, consumes additional interfaces
[Diagram: switch architecture showing control plane (routing engine, OS), data plane (PFE), bus, management port, and data ports.]
Container Networking: Single Data Port
● Limited throughput performance (< 1 GbE)
● Bus is shared with routing duties
○ Unknown failure modes
● Cable required
○ Outage risk from possible misidentification as a network loop
○ Adds installation complexity
[Diagram: switch architecture showing control plane (routing engine, OS), data plane (PFE), bus, management port, and data ports.]
Container Networking: Any Port
● Any data port “cableless”
○ Most desirable
○ Still goes over internal bus
● Bus is shared with routing duties
○ Unknown failure modes
● Ideally, pS container would be bridged to SVI
○ Zero-touch deployment
○ Does not have to consume any ports on the front
○ Container can be deployed without additional cables
[Diagram: switch architecture showing control plane (routing engine, OS), data plane (PFE), bus, management port, and data ports.]
Obstacles: Net Admin Perspective
● Multiple proficiencies required to configure
● Complicated, not always well documented container creation
● Network Architecture
○ Container requires public IP address + small subnet per building
○ Rebooting a switch to solve a container issue is problematic
○ “Switch First” philosophy
● Security Architecture
○ Non-network admins require administrative access to the switch
○ How well is the container separated from control plane?
○ What other security issues does this present?
Obstacles: Hardware Limitations
● High performance platforms, no “budget network device” solution yet
● Discrete solution currently has some performance advantages
● Historically switches didn’t require a lot of storage:
○ Storage quickly becomes an issue with the larger images required to run
○ Typical flash might not have the space
○ External USB storage a possibility
■ Speed / bandwidth / performance concerns
● Shared bus = shared bandwidth
Obstacles: Vendor Support Readiness
● Early stage of vendor documentation
● Early and varied stages of collaboration by vendors
○ Proof-Of-Concept (Company J)
○ Pilot Stage (Company C)
○ Alpha Stage (Company A)
● Container Networking Issues (Company A, J)
● High performance platforms, no budget solution yet
Current Lab Metrics (“Ideal” Conditions - 1 Gbps)
Vendor  Container                          Networking               Latency   Throughput
A       Docker                             Docker “host”, Any Port  ~1.3 ms   ~390 Mbps
C       Proprietary (Docker based)         Management Port          ~2.3 ms   ~943 Mbps
C       Proprietary (Docker based)         Data Port                ~1.3 ms   ~942 Mbps
J       Proprietary (Docker or LXC based)  Management Port          ~1.9 ms   ~943 Mbps
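To make the lab numbers easier to compare, the sketch below expresses each throughput as a fraction of the 1 Gbps test rate. The values are the approximate ("~") figures from the table above, so the percentages are rough. The variable and function names are illustrative.

```python
LINE_RATE_MBPS = 1000.0  # the 1 Gbps "ideal conditions" test rate

# (vendor, networking mode, approximate throughput in Mbps) from the table.
lab_results = [
    ("A", "Any Port", 390.0),
    ("C", "Management Port", 943.0),
    ("C", "Data Port", 942.0),
    ("J", "Management Port", 943.0),
]


def utilization(throughput_mbps, line_rate_mbps=LINE_RATE_MBPS):
    """Fraction of the line rate that the measured throughput achieved."""
    return throughput_mbps / line_rate_mbps


for vendor, networking, mbps in lab_results:
    print(f"Vendor {vendor} ({networking}): {utilization(mbps):.1%} of line rate")
```

Seen this way, Vendors C and J reach roughly 94% of line rate in these tests, while Vendor A's "host"-networked container reaches about 39%.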
Company C Concurrent Throughput Testing
● Only tested in the management port configuration
● Tested throughput from Dell to Dell while the router was also running a throughput test
● Throughput was not affected by either test; consistent results were achieved
perfSONAR Active
Does the Vendor C Device act like a good perfSONAR node?
U-M Vendor C Device to Dell 330 pS node
    U-M Vendor C Device & Dell 330
1   943.75 Mbps
[Diagram: U-M Vendor C Device and Dell.]
perfSONAR Active
Does the Vendor C Device act like a good perfSONAR node?
U-M Vendor C Device to IU Vendor C Device
    U-M Vendor C Device & IU Vendor C Device
1   887.32 Mbps
2   932.41 Mbps
[Diagram: Internet, U-M Vendor C Device, IU Vendor C Device.]
C Routing
Does the Vendor C Device act like a good switch?
Dell 330 pS node to Dell 330 pS node
    Dell 330 & Dell 330
1   943.19 Mbps
2   942.52 Mbps
[Diagram: U-M Vendor C Device with two Dell nodes.]
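One way to check the "Don’t lie" requirement against these results is to compare the embedded node's throughput with the discrete Dell-to-Dell figures. The sketch below does that using the values reported on these slides. The 1% tolerance and the function name are assumptions for illustration, not thresholds from the presentation, and three samples are far too few for a real conclusion.

```python
from statistics import mean


def relative_difference(candidate_mbps, reference_mbps):
    """Absolute difference between the two means, as a fraction
    of the reference mean."""
    reference = mean(reference_mbps)
    return abs(mean(candidate_mbps) - reference) / reference


# Values reported on the slides.
embedded = [943.75]          # U-M Vendor C Device & Dell 330
discrete = [943.19, 942.52]  # Dell 330 & Dell 330

TOLERANCE = 0.01  # assumed: treat differences under 1% as equivalent

diff = relative_difference(embedded, discrete)
print(f"relative difference: {diff:.4%}")
print("within tolerance" if diff < TOLERANCE else "outside tolerance")
```

With these numbers the difference is on the order of a tenth of a percent, which is consistent with the embedded node measuring like a discrete one at 1 Gbps.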
perfSONAR Active and C Routing
Is pS working while the switch is working?
Dell 330 pS node to IU Vendor C Device
    Dell 330 & IU Vendor C Device
1   940.42 Mbps
[Diagram: Internet, U-M Vendor C Device, IU Vendor C Device, Dell.]
perfSONAR Active and C Routing
Is pS working while the switch is working?
Dell 330 pS node to Dell 330 pS node + U-M Vendor C Device to IU Vendor C Device
    U-M Vendor C Device & Dell 330   IU Vendor C Device & Dell 330
1   944.60 Mbps                      932.64 Mbps
2   943.26 Mbps                      930.76 Mbps
[Diagram: Internet, U-M Vendor C Device, IU Vendor C Device, two Dell nodes.]
perfSONAR Active and C Routing
Is pS working with switch while working?
Dell 330 pS node to U-M Vendor C Device and Dell 330 pS node to IU Vendor C Device
    Dell 330 & U-M Vendor C Device   Dell 330 & IU Vendor C Device
1   944.60 Mbps                      932.64 Mbps
2   943.26 Mbps                      930.76 Mbps
[Diagram: Internet, U-M Vendor C Device, IU Vendor C Device, two Dell nodes.]
perfSONAR Embedded Systems: POC Architecture
[Diagram: POC architecture with devices labeled C, C, C, A, J.]
Manufacturing Collaboration Vendor C Architecture
Manufacturing Collaboration Vendor C Observations
● Non-standard container
○ Derived from Docker container
● Non-trivial container/switch config
● Container sometimes needs restarting
● Missing the option to bridge to SVI
○ Currently all configs need extra cable
Start of production pilot soon @ U-M
Manufacturing Collaboration Vendor J Architecture
Vendor J History
● perfSONAR demo at SC17
○ Ran on KVM
○ Management Port Networking
● Non-EVO experiments (2018)
○ Learning experience
○ Container networking limitations prevented a successful pilot program
Manufacturing Collaboration Vendor J Observations
Current Issues
● Complicated container process
○ LXC with distrobuilder, chroot, no systemd, etc.
○ Currently Ubuntu only
○ Privileged vs unprivileged container security model
○ Requires perfSONAR REST interface re-development
● EVO on 5200
○ Experimental Release, Only in J’s Lab
Manufacturing Collaboration Vendor A Architecture
● Container Architecture natively supported (Docker)
● Container not intended to be attached to external network (requires workaround)
● Disk space issues
Manufacturing Collaboration Vendor A Observations
● Container networking issues
○ Forced to use Docker “Host” Networking
○ Not feasible in production
○ Feature Request for supported “bridge” container networking
● Standard Docker container support
● Disk / partition space issues
○ Advanced Docker config required
Next Steps
● Finish deployment and testing with Company A and J
● Enhance documentation
● Automate container creation
● Explore container security architecture
● Further testing
○ Add traffic, more hosts
○ Inclusion in test meshes
○ BGP Storm Stress Test
■ Company J idea
Lessons Learned: Staff Proficiency
Necessary Breadth and Depth of Skills
● perfSONAR Administrator
○ Deploy / maintain perfSONAR nodes & meshes
● Network Administrator
○ Switch configuration, basic container troubleshooting
○ Network Architecture
● System Administrator
○ Operating System Expertise: UNIX
○ Containers: Docker, LXC, proprietary, etc.
○ NTP / Security / System Tuning
[Diagram labels: System Administration, Network Administration]
Advice for Next Generation Appliances
● Throughput speeds = max line speed
○ Faster / parallel bus?
○ Application specific ports?
○ Dedicated ASIC in Data Plane?
● Implement container architecture
○ Docker, LXC, etc.
● Simple & powerful container networking
○ Reduce configuration complexity
○ Route through any data port (“wireless”)
Next Steps
● Deploy perfSONAR in U-M lab
○ EVO on 5200
● Enhance documentation
● Automate container creation
● Explore container security architecture
● Further testing
○ Add traffic, more hosts
○ Inclusion in test meshes
○ BGP Storm Torture Test
Credits
University of Michigan
Eric Boyd, Phil Camp, Ed Colone, Dan Eklund, Brady Farver, Ryan Goniwiecha, Amy Liebowitz, John Simpkins, Katarina Thomas
Internet2
Mark Feit
Indiana University
Dan Doyle, Michael Johnson
Jisc
Tim Chown, Raul Lopes
Vendors
Numerous engineers from multiple teams from multiple vendors