Cisco FabricPath Introduction
TRANSCRIPT — 5/25/11
Marian Klas, [email protected]
© 2011 Cisco and/or its affiliates. All rights reserved. Cisco Public 2
Agenda
FabricPath introduction
FabricPath technical review
FabricPath design considerations
FabricPath value proposition
FabricPath Introduction
Eternal Debates on Network Design: Layer 2 or Layer 3?
Both Layer 2 and Layer 3 are required for any network design; Cisco has solutions for both Layer 2 and Layer 3 to satisfy customers' requirements.
(Diagram: a Layer 3 network core with Layer 2 VLANs at the access layer.)
Layer 2? Simplicity (no planning/configuration required for either addressing or control plane); a single control-plane protocol for unicast, broadcast, and multicast; easy application development.
Layer 3? Subnets provide fault isolation; scalable control plane with inherent multipathing and multi-topology support; HA with fast convergence; additional loop-mitigation mechanisms in the data plane (e.g. TTL, RPF check, etc.).
L2 Network Requirements inside DC
Maximize Bisection Bandwidth
Scalable Layer 2 domain
High Availability Resilient control-plane Fast convergence upon failure Fault-domain isolation
Facilitate Application Deployment Workload mobility, Clustering, etc.
Multi-Pathing/Multi-Topology
Constraints for Scaling Layer 2 Network
Constraints: MAC table size, cross-chassis port-channel, port density on switches, over-subscription ratio, complex STP configuration.
(Diagram: a Data Center Core / Aggregation / Access topology with a vPC domain. STP port roles and guards annotated – BPDUguard (B), Loopguard (L), Rootguard (R), network port (N), edge port (E), normal port type (-); primary/secondary STP Root, HSRP Active/Standby, and primary/secondary vPC peers; Layer 3 above the aggregation layer, Layer 2 below.)
Existing Options for Layer 2 Expansion
Higher Port-Density
More I/O slots
More ports per I/O module
Port-Channel/Link-Aggregation
More ports in a bundle (up to 16-port today)
Multiple Inter-Switch Links
STP only allows a single active link between 2 devices
Higher Interface Speed
Use an interface whose speed equals the combination of multiple lower-speed links
Wasted bandwidth, higher port cost, limited scale
Spanning Tree and Over-subscription
Branches of trees never interconnect (no loop!!!)
Spanning Tree Protocol (STP) uses the same approach to build loop-free L2 logical topology
Over-subscription ratio exacerbated by STP algorithm
11 Physical Links
5 Logical Links
Limitations of Spanning Tree Protocol
Sub-optimal path selection: single path between any 2 bridges in the same L2 network; shortest path only from the Root Bridge's perspective
Under-utilized bandwidth: ensures loop-free L2 logical topologies by blocking redundant links; waste of available bandwidth increases as link speeds get faster and faster
No control-plane security: Root election is purely based on switch ID, which is prone to problems caused by operator errors
Slow and unreliable reconvergence upon link failure: up to seconds of service disruption even with RSTP
Is Over-subscription Acceptable?
Campus network: mostly North-South traffic flows; over-subscription acceptable for client-server types of applications.
Data center: a mix of North-South and East-West traffic flows; often demands special design consideration to minimize the bandwidth limitation imposed by over-subscription.
Nature of Layer 2 Bridging
Transparent – acts like "shared media" to end devices
Plug-N-Play – no user configuration is required to build the forwarding database
Data-plane learning – forwarding database built based on frame contents
Flooding – default forwarding behavior for frames with an unknown unicast destination is to flood the whole broadcast domain
Every MAC, Everywhere!!! – all unicast MACs need to be learned by all bridges in the same bridge domain to minimize flooding
(Diagram: a Layer 2 domain in which every bridge's MAC table ends up holding an entry for MAC A – every MAC, everywhere.)
Network Addressing Scheme: MAC vs. IP
(Diagram: host 10.0.0.10/24 – the IP address splits into network address 10.0.0.0/24 and host address 10.0.0.10, while the MAC address 0011.1111.1111 is a flat, non-hierarchical address.)
L2 Forwarding (Bridging): data-plane learning; flat address space and forwarding table (MAC everywhere!!!); flooding required for unknown unicast destinations; destination MACs need to be known by all switches in the same network to avoid flooding. (Diagram: the same MAC 0011.1111.1111 appears in every switch's table.)
L3 Forwarding (Routing): control-plane learning; hierarchical address space and forwarding; forwarding only to destination addresses with matching routes in the table; flooding is isolated within subnets; no dependence on the data plane for maintaining the forwarding table. (Diagram: routes 10.0.0.0/24, 10.0.0.0/16, 20.0.0.0/16, 20.0.0.0/24 between hosts 10.0.0.10 and 20.0.0.20.)
The Next Era of Layer 2 Network What Can Be Improved?
Network Address Scheme: Flat → Hierarchical. An additional header is required to allow L2 "routing" instead of "bridging", and provides an additional loop-prevention mechanism like TTL.
Address Learning: Data Plane → Control Plane. Eliminates the need to program all MACs on every switch to avoid flooding.
Control Plane: Distance-Vector → Link-State. Improves scalability, minimizes convergence time, and allows multipathing inherently.
The ultimate solution needs to take both the control and data plane into consideration this time!!!
Introducing Cisco FabricPath – An NX-OS Innovation for Layer 2 Networks
Layer 2 strengths: simple configuration, flexible provisioning, low cost
Layer 3 strengths: leverage bandwidth, fast convergence, highly scalable
FabricPath combines both: simplicity, flexibility, bandwidth, availability, cost
(Diagram: FabricPath positioned across performance, scale, simplicity, resilience, and flexibility.)
Cisco FabricPath Overview
Data Plane Innovation: FabricPath encapsulation; no MAC learning via flooding; routing, not bridging; built-in loop mitigation (Time-to-Live (TTL), RPF check)
Control Plane Innovation: plug-n-play Layer 2 IS-IS; supports unicast and multicast; fast, efficient, and scalable; Equal-Cost Multipathing (ECMP); VLAN and multicast pruning
(Diagram: Cisco FabricPath delivered in Cisco NX-OS on the Cisco Nexus platform.)
Update on TRILL standard status
Q: What’s the latest status of the TRILL standard?
A: TRILL has officially moved from Draft to Proposed Standard in the IETF:
RBridges: Base Protocol Specification (draft-ietf-trill-rbridge-protocol-16) – data plane, frame formats, learning, etc. http://datatracker.ietf.org/doc/draft-ietf-trill-rbridge-protocol/
RBridges: Adjacency (draft-ietf-trill-adj-07) – IS-IS over shared media http://datatracker.ietf.org/doc/draft-ietf-trill-adj/
TRILL Use of IS-IS (draft-ietf-isis-trill-05) – TRILL IS-IS TLV encodings http://datatracker.ietf.org/doc/draft-ietf-isis-trill/
Extensions to IS-IS for Layer-2 Systems (RFC 6165) – architecture of IS-IS for L2 networks http://datatracker.ietf.org/doc/rfc6165/
Proposed Standard status means vendors can confidently begin developing TRILL-compliant software implementations
https://datatracker.ietf.org/wg/trill/
FabricPath is standards-based – no proprietary lock-in
The current Cisco implementation is based on existing standards and modeled on proposed standards
Easily migrated to industry standards in the future: the hardware is already capable, and a software load will provide the standards-based control plane
Nexus 7000 F-Series Module First FabricPath-capable hardware platform from Cisco
“The F-Series modules on the Cisco Nexus 7000 series are currently deployed in LLNL’s high performance computing infrastructure, offering us a high density 10GE and low latency networking solution. This technology has enabled LLNL to build large storage network fabrics to support the world class supercomputing systems vital to the laboratory's national security research and development missions”
Matt Leininger, Deputy for Advanced Technology
Projects at Lawrence Livermore National Laboratory
Scalable to 512 ports per system
High performance: 320/230 Gbps (switching/backplane), 5 µs latency
Investment protection: seamless upgrade and interoperability
Standards-based: TRILL and DCB support
Flexible: 1/10G auto-sensing ports
Energy efficient: ~10 W per 10GbE port
Cisco FabricPath enables faster, simpler, flatter data center networks
“We assessed FabricPath in terms of its ability to boost bandwidth, reroute around trouble, and simplify network management. In all three areas, FabricPath delivered”
“The switches forwarded all traffic with zero frame loss, validating FabricPath's ability to load-share across 16 redundant connections.”
“FabricPath converges far faster than spanning tree.”
“There's no question it represents a significant advancement in the state of the networking art”
http://www.networkworld.com/reviews/2010/102510-cisco-fabricpath-test.html
FabricPath Technical Review
FabricPath IS-IS
FabricPath IS-IS replaces STP as control-plane protocol in FabricPath network
Introduces link-state protocol with support for ECMP for Layer 2 forwarding
Exchanges reachability of Switch IDs and builds forwarding trees
Improves failure detection, network reconvergence, and high availability
Minimal IS-IS knowledge required – no user configuration by default
Maintains plug-and-play nature of Layer 2
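The plug-and-play behavior described above means a minimal FabricPath bring-up needs only a few commands. A hedged sketch for NX-OS on an F-Series VDC – switch IDs are auto-assigned, so the explicit switch-id is optional, and the VLAN ranges and interface numbers here are illustrative:

```
! Enable the FabricPath feature set (one-time install, then enable)
install feature-set fabricpath
feature-set fabricpath

! Optional: pin a deterministic switch ID instead of the auto-assigned one
fabricpath switch-id 100

! Mark the VLANs that should traverse the fabric
vlan 10-20
  mode fabricpath

! Core-facing links become FabricPath interfaces (run IS-IS, no STP)
interface ethernet 1/1-4
  switchport mode fabricpath
  no shutdown
```

With only these lines, FabricPath IS-IS forms adjacencies and builds the Switch ID routing table on its own.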
(Diagram: STP BPDUs are exchanged on Classic Ethernet edge links; FabricPath IS-IS runs inside the FabricPath domain.)
Why IS-IS?
A few key reasons:
No IP dependency – no need for IP reachability in order to form adjacencies between devices
Easily extensible – using custom TLVs, IS-IS devices can exchange information about virtually anything
Provides SPF routing – excellent topology-building and reconvergence characteristics
FabricPath versus Classic Ethernet Interfaces
Classic Ethernet (CE) interface: connects to existing NICs and traditional network devices; sends/receives traffic in 802.3 Ethernet frame format; participates in the STP domain; forwarding based on the MAC table.
FabricPath interface: connects to another FabricPath device; sends/receives traffic with the FabricPath header; no spanning tree!!!; no MAC learning; exchanges topology info through L2 IS-IS adjacency; forwarding based on the 'Switch ID Table'.
(Diagram: plain Ethernet frames on CE interfaces; Ethernet frames with a FabricPath header on FabricPath interfaces.)
FabricPath versus CE VLANs
In a FabricPath system, each VLAN is identified as either a CE VLAN (the default) or a FabricPath VLAN
Only traffic in FabricPath VLANs can traverse the FabricPath domain
Bridging between M1 and F1 ports is possible only on CE VLANs
VLAN Mode
n7k(config)# vlan 10
n7k(config-vlan)# mode ?
  ce          Classical Ethernet VLAN mode
  fabricpath  FabricPath VLAN mode
n7k(config-vlan)# mode
(Diagram: CE VLANs span M1 and F1 ports; FabricPath VLANs live on F1 ports.)
Basic FabricPath Data Plane Operation
Ingress FabricPath switch determines destination Switch ID and imposes FabricPath header
Destination Switch ID used to make routing decisions through FabricPath core
No MAC learning or lookups required inside core
Egress FabricPath switch removes FabricPath header and forwards to CE
(Diagram: Host MAC A behind ingress switch S10 sends a frame [DMAC B / SMAC A / Payload] toward MAC B behind S20. The ingress FabricPath switch prepends DSID 20 / SSID 10; the frame is routed across the FabricPath core on the destination Switch ID; the egress FabricPath switch strips the header and delivers the original Ethernet frame to the CE edge.)
Cisco FabricPath Frame
Classical Ethernet Frame
FabricPath Encapsulation 16-Byte MAC-in-MAC Header
Switch ID – unique number identifying each FabricPath switch
Sub-Switch ID – identifies devices/hosts connected via VPC+
Port ID – identifies the destination or source interface
Ftag (Forwarding tag) – unique number identifying a topology and/or multidestination distribution tree
TTL – decremented at each switch hop to prevent frames from looping infinitely
Classical Ethernet frame: DMAC | SMAC | 802.1Q | Etype | Payload | CRC
FabricPath frame: Outer DA (48) | Outer SA (48) | FP Tag (32) | original CE frame (DMAC | SMAC | 802.1Q | Etype | Payload) | CRC (new)
Outer DA/SA layout: Endnode ID (5:0, 6 bits) | U/L (1 bit) | I/G (1 bit) | Endnode ID (7:6, 2 bits) | OOO/DL (1 bit) | RSVD (1 bit) | Switch ID (12 bits) | Sub-Switch ID (8 bits) | Port ID (16 bits)
FP Tag layout: Etype (16 bits) | Ftag (10 bits) | TTL (6 bits)
FabricPath MAC Table Edge switches maintain both MAC address table and Switch ID table
Ingress switch uses MAC table to determine destination Switch ID
Egress switch uses MAC table (optionally) to determine output switchport
Local MACs point to switchports
Remote MACs point to Switch IDs
(Diagram: spines S10–S40 with edge switches S100, S101, S200; MAC A and MAC B attach to S100, MAC C to S101, MAC D to S200.)
FabricPath MAC Table on S100:
MAC  IF/SID
A    e1/1
B    e1/2
C    S101
D    S200
(Diagram: S100 connects to spines S10–S40 via po1–po4; host A attaches to S100 and host B to S200.)
show mac address-table dynamic
S100# sh mac address-table dynamic
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 10       0000.0000.0001    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0002    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0003    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0004    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0005    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0006    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0007    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0008    dynamic   0          F    F   Eth1/15
* 10       0000.0000.0009    dynamic   0          F    F   Eth1/15
* 10       0000.0000.000a    dynamic   0          F    F   Eth1/15
  10       0000.0000.000b    dynamic   0          F    F   200.0.30
  10       0000.0000.000c    dynamic   0          F    F   200.0.30
  10       0000.0000.000d    dynamic   0          F    F   200.0.30
  10       0000.0000.000e    dynamic   0          F    F   200.0.30
  10       0000.0000.000f    dynamic   0          F    F   200.0.30
  10       0000.0000.0010    dynamic   0          F    F   200.0.30
  10       0000.0000.0011    dynamic   0          F    F   200.0.30
  10       0000.0000.0012    dynamic   0          F    F   200.0.30
  10       0000.0000.0013    dynamic   0          F    F   200.0.30
  10       0000.0000.0014    dynamic   0          F    F   200.0.30
S100#
FabricPath Routing Table FabricPath IS-IS manages Switch ID (routing) table
All FabricPath-enabled switches automatically assigned Switch ID (no user configuration required)
Algorithm computes shortest (best) paths to each Switch ID based on link metrics
Equal-cost paths supported between FabricPath switches
(Diagram: edge switch S100 connects to spines S10–S40 over links L1–L4; S101 and S200 are other edge switches.)
FabricPath Routing Table on S100:
Switch  IF
S10     L1
S20     L2
S30     L3
S40     L4
S101    L1, L2, L3, L4
…       …
S200    L1, L2, L3, L4
One 'best' path to S10 (via L1); four equal-cost paths to S101.
Building the FabricPath Routing Table
(Diagram: spines S10–S40 and edge switches S100, S101, S200. S100's uplinks are L1–L4, S101's are L5–L8, S200's are L9–L12. FabricPath IS-IS computes a routing table on every switch.)
Routing table on S100: S10 → L1; S20 → L2; S30 → L3; S40 → L4; S101 → L1, L2, L3, L4; …; S200 → L1, L2, L3, L4
Routing table on S10: S20 → L1, L5, L9; S30 → L1, L5, L9; S40 → L1, L5, L9; S100 → L1; S101 → L5; …; S200 → L9
Routing table on S40: S10 → L4, L8, L12; S20 → L4, L8, L12; S30 → L4, L8, L12; S100 → L4; S101 → L8; …; S200 → L12
Routing table on S200: S10 → L9; S20 → L10; S30 → L11; S40 → L12; S100 → L9, L10, L11, L12; S101 → L9, L10, L11, L12; …
show fabricpath route

S100# sh fabricpath route
FabricPath Unicast Route Table
'a/b/c' denotes ftag/switch-id/subswitch-id
'[x/y]' denotes [admin distance/metric]
ftag 0 is local ftag
subswitch-id 0 is default subswitch-id

FabricPath Unicast Route Table for Topology-Default

0/100/0, number of next-hops: 0
        via ---- , [60/0], 5 day/s 18:38:46, local
1/10/0, number of next-hops: 1
        via Po1, [115/10], 0 day/s 04:15:58, isis_l2mp-default
1/20/0, number of next-hops: 1
        via Po2, [115/10], 0 day/s 04:16:05, isis_l2mp-default
1/30/0, number of next-hops: 1
        via Po3, [115/10], 2 day/s 08:49:51, isis_l2mp-default
1/40/0, number of next-hops: 1
        via Po4, [115/10], 2 day/s 08:47:56, isis_l2mp-default
1/200/0, number of next-hops: 4
        via Po1, [115/20], 0 day/s 04:15:58, isis_l2mp-default
        via Po2, [115/20], 0 day/s 04:15:58, isis_l2mp-default
        via Po3, [115/20], 2 day/s 08:49:51, isis_l2mp-default
        via Po4, [115/20], 2 day/s 08:47:56, isis_l2mp-default
S100#
(Diagram: S100 connects to spines S10–S40 via po1–po4; host A on S100, host B on S200.)
FabricPath ECMP
When multiple forwarding paths are available, path selection is based on an ECMP hash function
Up to 16 next-hop interfaces for each destination Switch ID
Number of next-hops installed in the U2RIB is controlled by the maximum-paths command under the FabricPath IS-IS process (default is 16)
(Diagram: S100 with 16-way ECMP toward spines S1–S16.)
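The maximum-paths knob mentioned above can be sketched as follows; this assumes the command sits under the default FabricPath domain's IS-IS process, and the value 8 is purely illustrative:

```
! Limit the number of ECMP next-hops installed in the U2RIB (default 16)
fabricpath domain default
  maximum-paths 8

! Verify the equal-cost next-hops toward each destination Switch ID
show fabricpath route
```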
Conversational MAC Learning
MAC learning method designed to conserve MAC table entries on FabricPath edge switches
FabricPath core switches do not learn MACs at all
Each forwarding engine distinguishes between two types of MAC entry:
Local MAC – MAC of host directly connected to forwarding engine Remote MAC – MAC of host connected to another forwarding engine or switch
The forwarding engine learns a remote MAC only if a bidirectional conversation is occurring between the local and remote MAC
MAC learning not triggered by flood frames
Conversational learning enabled in all FabricPath VLANs
Conversational MAC Learning
(Diagram: a FabricPath core connects S100 (host MAC A on e1/1), S200 (host MAC B on e12/1), and S300 (host MAC C on e7/10); A converses with B, and B converses with C.)
FabricPath MAC Table on S100: A → e1/1 (local); B → S200 (remote)
FabricPath MAC Table on S200: A → S100 (remote); B → e12/1 (local); C → S300 (remote)
FabricPath MAC Table on S300: B → S200 (remote); C → e7/10 (local)
FabricPath Multidestination Trees
Multidestination traffic is constrained to loop-free trees touching all FabricPath switches
A Root switch is assigned for each multidestination tree in the FabricPath domain
A loop-free tree is built from each Root and assigned a network-wide identifier (Ftag)
Support for multiple multidestination trees provides multipathing for multidestination traffic
Two trees are supported in NX-OS release 5.1
(Diagram: spines S10–S40 and edge switches S100, S101, S200. S10 is the Root for Tree 1 and S40 is the Root for Tree 2; each logical tree touches all switches.)
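Root placement for the trees is steered by a configurable priority under the FabricPath IS-IS process; the highest priority wins Root for the first tree. A hedged sketch (the value is illustrative):

```
! On the spine switch that should become Root for the first tree
fabricpath domain default
  root-priority 255
```

Leaving the priority at its default lets the fabric pick Roots automatically, which preserves the plug-and-play behavior.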
Multidestination Trees and Role of the Ingress FabricPath Switch
The ingress FabricPath switch determines which tree to use for each flow
Other FabricPath switches forward based on the tree selected by the ingress switch
Broadcast and unknown unicast typically use the first tree
Hash-based tree selection for multicast, with several configurable hash options
Multidestination Trees on Switch 100: Tree 1 → L1, L2, L3, L4; Tree 2 → L4
(Diagram: S10 is Root for Tree 1, S40 is Root for Tree 2; links L1–L12 connect the edge and spine switches.)
Putting It All Together – Host A to Host B (1) Broadcast ARP Request
(Diagram: Host A on S100 (MAC table: A → e1/1, local) broadcasts an ARP request for Host B on S200. S100 encapsulates the flood frame [DMAC FF / SMAC A / Payload] with SSID 100 and Ftag 1 and floods it on Tree 1: S100 uses L1–L4, S10 relays on L1, L5, L9, and S200 receives it on L9. S200's MAC table learns nothing from the flood frame.)
Learn MACs of directly connected devices unconditionally; don't learn MACs in flood frames.
Putting It All Together – Host A to Host B (2) Unicast ARP Reply
(Diagram: Host B replies with a unicast ARP reply [DMAC A / SMAC B / Payload]. S200 learns B → e12/2 (local), but A's location is unknown at S200, so the reply is sent as unknown unicast on Tree 1 with SSID 200 and Ftag 1. At S100 the DMAC (A) is known locally on e1/1, so S100 conversationally learns the remote MAC: B → S200.)
If the DMAC is known, then learn the remote MAC.
Putting It All Together – Host A to Host B (3) Unicast Data
(Diagram: Host A sends unicast data [DMAC B / SMAC A / Payload]. S100's MAC table now holds B → S200 (remote), and its routing table has four equal-cost paths to S200 (L1–L4); an ECMP hash selects one, e.g. L3 toward S30. The frame carries DSID 200 / SSID 100 / Ftag 1. S30 routes purely on the destination Switch ID (S200 → L11); S200 strips the FabricPath header, forwards to B on e12/2, and learns the remote MAC A → S100.)
Loop Mitigation with FabricPath
STP domain: redundant paths are blocked to ensure a loop-free topology; if STP fails, frames loop indefinitely, which can result in a complete network meltdown as a result of flooding.
L2 Fabric: minimizes the impact of transient loops with TTL and RPF check. The TTL is part of the FabricPath header and is decremented by 1 at each hop; frames are discarded when TTL = 0. The RPF check for multicast is based on the "tree" info.
(Diagram: a frame entering at S1 with TTL = 3 is decremented at each hop through S10 and S2 and discarded at TTL = 0; multicast frames are RPF-checked against the distribution tree built from the Root.)
VLAN Pruning in the L2 Fabric
Switches indicate 'locally interested VLANs' to the rest of the L2 Fabric
Broadcast traffic for any VLAN is only sent to switches that have requested it
(Diagram: a shared broadcast tree in the L2 Fabric is pruned per VLAN, so VLAN 10, VLAN 20, and VLAN 30 broadcasts each reach only the edge switches carrying that VLAN.)
Introducing VPC+
VPC+ allows dual-homed connections from edge ports into FabricPath domain with active/active forwarding
CE switch, Layer 3 router, dual-homed server, etc.
VPC+ requires F1 modules with FabricPath enabled in the VDC
Peer-link and all VPC+ connections must be to F1 ports
VPC+ creates “virtual” FabricPath switch for each VPC+-attached device to allow load-balancing within FabricPath domain
(Diagram: FabricPath switches S1 and S2 form a VPC+ pair over F1 ports; CE switch S3 is dual-homed via port-channel po3. Physical view: Host A sits behind S3 on links L1 and L2. Logical view: VPC+ introduces virtual "Switch 4" (S4), which becomes the next-hop for Host A in the FabricPath domain – Host A → S4 → L1, L2 – allowing load balancing across both peers.)
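A minimal VPC+ sketch, assuming FabricPath is already enabled and the peer link rides F1 ports; the domain number, virtual switch ID (4, matching the figure), keepalive address, and port numbers are illustrative:

```
vpc domain 1
  ! The fabricpath switch-id turns VPC into VPC+ and creates the
  ! virtual switch seen by the rest of the FabricPath domain
  fabricpath switch-id 4
  peer-keepalive destination 10.1.1.2

! The peer link runs as a FabricPath core port
interface port-channel 1
  switchport mode fabricpath
  vpc peer-link

! Dual-homed VPC+ member port toward the CE switch S3
interface port-channel 3
  switchport
  switchport mode trunk
  vpc 3
```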
VPC vs. VPC+
A given VDC can be part of VPC domain, or VPC+ domain, but not both
VPC+ only works on F1 modules with FabricPath enabled in the VDC
Conversion between VPC and VPC+ is disruptive
                           VPC                     VPC+
Peer-link                  M1 ports or F1 ports    F1 ports
Member ports               M1 ports or F1 ports    F1 ports
VLANs                      CE or FabricPath VLANs  FabricPath VLANs only
Peer-link switchport mode  CE trunk port           FabricPath core port
VPC+ Physical Topology
(Diagram: spines S10–S40 over FabricPath; edge switches S100 and S200; hosts MAC A, MAC B, and MAC C attached below, with MAC A dual-homed via VPC+.)
Peer link and PKA required
Peer link runs as FabricPath core port
VPCs configured as normal
No requirements for attached devices other than channel support
VLANs must be FabricPath VLANs
VPC+ Logical Topology
(Diagram: same topology, but the VPC+ pair now appears as virtual switch S1000 in the FabricPath domain; the dual-homed host MAC A sits behind S1000.)
Remote MAC Entries for VPC+
(Diagram: host MAC A dual-homed via VPC+ behind virtual switch S1000; S200 reaches S1000 via po1/po2 and has a local host on Eth1/30.)

S200# sh mac address-table dynamic
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 10       0000.0000.000c    dynamic   1500       F    F   Eth1/30
  10       0000.0000.000a    dynamic   1500       F    F   1000.11.4513
S200#
FabricPath Routing for VPC+
(Diagram: S200 sees virtual switch S1000 as reachable through both VPC+ peers, via po1 and po2.)

S200# sh fabricpath route topology 0 switchid 1000
FabricPath Unicast Route Table
'a/b/c' denotes ftag/switch-id/subswitch-id
'[x/y]' denotes [admin distance/metric]
ftag 0 is local ftag
subswitch-id 0 is default subswitch-id

FabricPath Unicast Route Table for Topology-Default

1/1000/0, number of next-hops: 2
        via Po1, [115/10], 0 day/s 01:09:56, isis_l2mp-default
        via Po2, [115/10], 0 day/s 01:09:56, isis_l2mp-default
S200#
VPC+ and Active/Active HSRP
With VPC+ and SVIs in a mixed chassis, HSRP Hellos are sent with the VPC+ virtual switch ID
FabricPath edge switches learn the HSRP MAC as reachable through the virtual switch
Traffic destined to the HSRP MAC can leverage ECMP if available
Either VPC+ peer can route traffic destined to the HSRP MAC
(Diagram: HSRP Active and Standby SVIs on the VPC+ peers; Hellos [SMAC HSRP / SSID 1000] are flooded into the fabric, so edge switches install the HSRP MAC behind virtual switch S1000.)
HSRP MAC on Edge Switches
(Diagram: HSRP Active and Standby SVIs behind VPC+ virtual switch S1000; edge switch S200 attaches via po1/po2.)

S200# sh mac address-table dynamic address 0000.0c07.ac0a
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
  10       0000.0c07.ac0a    dynamic   0          F    F   1000.0.1054
S200#
Active/Active HSRP for FabricPath with VPC+
Programs the gateway MAC on both active and standby devices
Requires the VPC+ peer link on F1 modules
(Diagram: a mixed M1/F1 VPC+ pair runs HSRP Active/Standby at the L2/L3 boundary. The CE switch points GWY MAC → po3; the FabricPath edge learns GWY MAC → L1, L2 (both peers); on each peer the F1 forwarding engines point GWY MAC → proxy L3 port-channel, which resolves to the M1 router MAC.)
Active/Active Gateway MAC for FabricPath with VPC+
External HSRP routers connected via VPC+
Gateway MAC advertised into the FabricPath domain
(Diagram: an all-F1 VPC+ pair (S1, S2) connects external HSRP Active/Standby routers via po1/po2; on each peer GWY MAC → po1; the FabricPath edge learns GWY MAC → L1, L2, and the CE switch reaches it via po3.)
Active/Active HSRP with Service Nodes
(Diagram: a mixed M1/F1 VPC+ pair (S1, S2) runs HSRP with active and standby service nodes attached via po1/po2. The CE switch points GWY → po3 and Services → po3; the FabricPath edge learns GWY → L1, L2 and Services → L1, L2; on each peer GWY → proxy L3 and Services → po1, with the gateway MAC resolving to the router MAC.)
FabricPath Design Considerations
FabricPath Design Guidance
Industry has converged on a handful of well-understood designs/network topologies
Largely driven by constraints of STP, and density limits of switches
Designs will necessarily evolve – not only in what can/cannot be built today versus in the future, but in how people think about L2 designs in general
Key FabricPath QA-Tested Scale Numbers
FabricPath switches in topology: 32
FabricPath VLANs in topology: 2000
Topologies: 1
Multidestination trees: 2
Total edge ports: 512
VPC+ port-channels: 110
Multicast (IGMP snooping) entries: 5K
Greenfield Designs
Greenfield designs fairly simple – Access layer interconnected by Spine switches
Well-suited for HPC, cluster/grid computing applications
Limited broadcast and little or no requirement for L3; the network and/or applications can be tailored to maximize throughput and minimize latency
Main consideration is how to incorporate Layer 3 for East-West/North-South traffic
Grid/Cluster Design
Uses FabricPath to build high-density, low-latency network with non-blocking any-to-any communication paths
Based on Day 1 limits (32 switches), can build up to a 4,096-port 10G server domain
In theory, can build up to an 8,192-port 10G server domain
Ideal for custom application environments with limited/no Layer 3 requirement
(Diagram: 16 Nexus 7018 spine chassis (512 10GE FabricPath ports each) and 32 Nexus 7018 edge chassis (256 10GE FabricPath ports and 256 10GE host ports each), interconnected with 16-way ECMP and 16-port port-channels – 8,192 non-blocking 10G servers and up to 160 Tbps of system bandwidth.)
Alternatives for N-Way Layer 3 Egress
Various alternatives exist, depending on FHRP preference and location of L2/L3 boundary
FHRP options: HSRP/VRRP, GLBP
L2/L3 boundary: internal or external routers
Alternatives for N-Way Layer 3 Egress – GLBP with FabricPath (Internal Routers)
Single virtual IP, multiple virtual MACs (up to 4)
Load sharing toward exit points based on which MAC each server learns through ARP
(Diagram: four routers S1–S4 with SVIs at the L2/L3 boundary, reached over FabricPath links L1–L4. All four answer for gateway IP X but with distinct gateway MACs A–D; the fabric learns GWY MAC A → L1, B → L2, C → L3, D → L4.)
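The GLBP side can be sketched as below, repeated on each of the four routers; NX-OS requires the feature to be enabled first, and the addresses, group number, and priority are illustrative. GLBP answers each server's ARP with a different virtual forwarder MAC, which yields the per-server load sharing described above:

```
feature glbp

interface vlan 10
  ip address 10.0.10.2/24
  glbp 10
    ip 10.0.10.1
    ! Optional per-router tuning of the AVG election
    priority 110
  no shutdown
```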
Alternatives for N-Way Layer 3 Egress – GLBP with FabricPath (External Routers)
Avoids the M1/F1 mixed chassis and provides more FabricPath port density
(Diagram: as before, but the four GLBP routers sit outside the FabricPath switches S1–S4; the fabric still learns GWY MAC A → L1, B → L2, C → L3, D → L4 for gateway IP X.)
Alternatives for N-Way Layer 3 Egress: MHSRP with FabricPath
More complex configuration and requires DHCP changes
But can scale beyond four active forwarders
[Figure: FabricPath fabric with switches S1–S4 and L1–L4; four MHSRP routers, with GWY MAC W learned via L1, GWY MAC X via L2, GWY MAC Y via L3, and GWY MAC Z via L4. For VLAN n, the HSRP groups rotate active (a) and standby (s) roles across the routers:
GWY IP W (a) / GWY IP Z (s), answering with GWY MAC W
GWY IP X (a) / GWY IP W (s), answering with GWY MAC X
GWY IP Y (a) / GWY IP X (s), answering with GWY MAC Y
GWY IP Z (a) / GWY IP Y (s), answering with GWY MAC Z]
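The MHSRP rotation above can be sketched as follows. The router and gateway names are hypothetical, and the DHCP helper is an illustration of why DHCP scopes must change (hosts must be spread across several gateway IPs), not Cisco code:

```python
# MHSRP-style scaling sketch: per VLAN, run one HSRP group per router,
# rotating roles so every router is active for exactly one gateway IP
# and standby for its neighbor's.

ROUTERS = ["R1", "R2", "R3", "R4"]
GATEWAY_IPS = ["GWY-W", "GWY-X", "GWY-Y", "GWY-Z"]

def hsrp_groups(routers, gateways):
    """Rotate active/standby roles across the routers for each gateway IP."""
    n = len(routers)
    return [
        {"gateway": gw, "active": routers[i], "standby": routers[(i + 1) % n]}
        for i, gw in enumerate(gateways)
    ]

def dhcp_gateway(host_id, gateways):
    """DHCP spreads hosts across the gateway IPs to balance egress traffic."""
    return gateways[host_id % len(gateways)]

groups = hsrp_groups(ROUTERS, GATEWAY_IPS)
print(groups[0])                     # {'gateway': 'GWY-W', 'active': 'R1', 'standby': 'R2'}
print(dhcp_gateway(5, GATEWAY_IPS))  # GWY-X (5 % 4 == 1)
```

Adding a fifth or sixth router just extends both lists, which is how this scheme scales past GLBP's four-forwarder limit.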
Alternatives for N-Way Layer 3 Egress: VLAN Splitting with HSRP
Splitting by VLAN avoids the DHCP challenge of MHSRP
Each router still has an interface in all VLANs but does not run HSRP in all of them (or runs HSRP in Listen mode)
[Figure: FabricPath fabric with switches S1–S4 and L1–L4; the VLANs are split into groups w, x, y, and z, with group w's GWY MAC W learned via L1, GWY MAC X via L2, GWY MAC Y via L3, and GWY MAC Z via L4. Each router is HSRP active for one VLAN group and standby for another: Active VLANs W / Standby VLANs Z (GWY MAC W); Active VLANs X / Standby VLANs W (GWY MAC X); Active VLANs Y / Standby VLANs X (GWY MAC Y); Active VLANs Z / Standby VLANs Y (GWY MAC Z)]
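The VLAN-splitting idea reduces to a deterministic mapping of VLANs onto routers. A sketch with a hypothetical helper (not Cisco code): rather than multiple gateway IPs per VLAN as with MHSRP, a different router is HSRP active for each VLAN group, so hosts keep one gateway IP per VLAN and DHCP scopes need no changes:

```python
# VLAN-splitting sketch: spread the VLANs themselves across the routers,
# making each router HSRP active for one group of VLANs.

ROUTERS = ["R1", "R2", "R3", "R4"]

def active_router_for(vlan_id, routers):
    """Deterministically spread VLANs across the available routers."""
    return routers[vlan_id % len(routers)]

for vlan in (100, 101, 102, 103):
    print(vlan, active_router_for(vlan, ROUTERS))
# 100 -> R1, 101 -> R2, 102 -> R3, 103 -> R4
```

Load balancing is per VLAN rather than per host, so it is only as even as the traffic distribution across VLAN groups.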
Alternatives for N-Way Layer 3 Egress: VLAN Splitting with Active/Active HSRP in VPC+
Leverages the benefit of VPC+ active/active HSRP
Each router still has an interface in all VLANs but does not run HSRP in all of them
Does require a VPC+ peer link and peer keepalive (PL/PKA) and a mixed (M1/F1) chassis
[Figure: FabricPath fabric with switches S1–S4; two VPC+ pairs at the L2/L3 boundary, one running active/active HSRP for VLAN group X (GWY MAC X, learned via both L1 and L2) and the other for VLAN group Y (GWY MAC Y, learned via both L3 and L4)]
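The gain over plain VLAN splitting can be stated as a small mapping (hypothetical names, a sketch of the scheme on the slide): with VPC+, both peers of a pair forward frames sent to the HSRP virtual MAC, so each VLAN group gets two active gateways instead of an active/standby pair:

```python
# VPC+ active/active HSRP sketch: each VLAN group is owned by one VPC+
# pair, and BOTH peers of that pair forward for the virtual MAC.

VPC_PAIRS = {
    "VLANs-X": ("L1", "L2"),  # pair owning GWY MAC X
    "VLANs-Y": ("L3", "L4"),  # pair owning GWY MAC Y
}

def active_forwarders(vlan_group):
    """Both peers of the owning VPC+ pair forward for the virtual MAC."""
    return VPC_PAIRS[vlan_group]

print(active_forwarders("VLANs-X"))  # ('L1', 'L2')
```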
FabricPath Core with L3 Access: Scales L3 at the Edge
Can extend VLANs through FabricPath backbone (no hard requirement to terminate L3 at edge VPC+ peers)
VLANs still have “affinity” to L3 access pair
[Figure: FabricPath core (switches S1–S4) with Layer 3 access pairs at the edge; each VPC+ access pair runs HSRP (active/standby SVIs) and OSPF toward the core, providing multiple L3 egress points (L3 Egress 1–4)]
Can extend some or all VLANs into FabricPath core
Requires FabricPath and Layer 3 support on the Nexus 5500
Integrating FEX with FabricPath
With F1, requires VDCs with external cross-links (M1 ports cannot belong to FabricPath VLANs)
[Figure: a mixed M1/F1 chassis split into a FabricPath VDC and a FEX VDC, interconnected with external cross-links on F1 ports; the VLANs run in CE mode in the FEX VDC and the same VLANs run in FabricPath (FP) mode in the FP VDC, with M1 ports providing L3]
FEX with FabricPath Using F2 Modules
With F2 modules, the VDC requirement is removed
[Figure: FEX connected directly to F2 ports in a single VDC, with the VLANs running in FabricPath (FP) mode on the F2 FabricPath uplinks]
FabricPath Value Proposition
Benefits of a FabricPath Network Fabric
Standards-based – no proprietary lock-in
Plug-and-play – minimal configuration and complexity
Optimal any-to-any connectivity – connect anywhere using an arbitrary topology; the fabric uses the best path
High bandwidth – high-performance modules and platforms with ample parallel bandwidth
Resilient – routing-like convergence
Scalable – easily grow the network based on business requirements
Easy migration – doesn’t follow the “rip-and-replace” model
Simple administration – not a “black box” to the network team
Key Takeaways
FabricPath is simple, scalable, and efficient
Innovations in FabricPath have the potential to change long-standing Layer 2 network design paradigms
Hardware now shipping has FabricPath and TRILL capability
FabricPath will evolve going forward: hardware, software, and design options will only increase flexibility and scale