DC Compute SAN
Carlos Lopez CCIE SAN, DC #21063
David Kester CCIE SAN #19555
Ed Mazurek CCIE SNA/IP, SAN #6448
© 2016 Cisco and/or its affiliates. All rights reserved. Cisco Public 2TAC-Time
Introductions
• Carlos Lopez
  • CCIE SAN, DC #21063
  • Technical Leader TAC
• Ed Mazurek
  • CCIE SNA/IP, SAN #6448
  • Technical Leader TAC
• David Kester
  • CCIE SAN #19555
  • Team Leader Storage Networking
Agenda
• Components
• Topology
• Nexus FC NPV vs FCoE-NPV
• Bugs
• MDS Slow Drain Troubleshooting Enhancements
Components
Cisco Multi-Protocol Product Portfolio: SAN, LAN, and Compute
12+ Years of Proven NX-OS Operating System
Cisco Prime Data Center Network Manager (DCNM)
Consistent, Simplified Features, Management, and Programmability
• LAN/SAN: Cisco Nexus 9000, Cisco Nexus 7000, Cisco Nexus 5600, Cisco Nexus 5500, Cisco Nexus 3000, Cisco Nexus 2000, Nexus 5672UP-16G, Nexus 2348UPQ (16G FC)
• SAN 16G: Cisco MDS 9700, Cisco MDS 9718, Cisco MDS 9396S, Cisco MDS 9250i, Cisco MDS 9148S
  • Line cards: 48x16G line-rate FC, 48x10GE line-rate FCoE, 24x40GE line-rate FCoE
• Compute: Cisco UCS C-Series Rack Servers, Cisco UCS B-Series Blade Servers, Cisco UCS 6200 Series FI, Cisco UCS 6300 Series FI
Cisco MDS 9000 Series 16G FC Director Switches
Driving Innovations for the Next Decade with a complete 16G Portfolio
Deploy Small, Medium, Large SANs with Cisco MDS 9000 Family
• Directors: Cisco MDS 9706, Cisco MDS 9710, Cisco MDS 9718
• Modules: Cisco 48x16G line-rate FC module, Cisco 48x10G line-rate FCoE module, Cisco 24x40G line-rate FCoE module
Future Proof, Reliable, Multi-Protocol Flexibility, Investment Protection, Ease of Management
Cisco MDS 9000 Series 16G FC Fabric Switches
Driving Innovations for the Next Decade with a complete 16G Portfolio
Deploy Small, Medium, Large SANs with Cisco MDS 9000 Family
Cisco MDS 9148S 16G FC Fabric Switch
Cisco MDS 9396S 16G FC Fabric Switch
Cisco MDS 9250i 16G Multi-Service Fabric Switch
Pay-as-You-Grow, Enterprise Class Features, Reliability, Multi-Protocol Flexibility, Ease of Management
Cisco MDS 9700 Directors Comparison

Hardware Feature                               MDS 9706          MDS 9710          MDS 9718
Height                                         9 RU              14 RU             26 RU
Line card slots                                4                 8                 16
Line-rate ports @ 16 Gbps FC or 10 Gbps FCoE   192               384               768
Line-rate ports @ 40 Gbps FCoE                 96                192               384
Fabric module slots (available / default)      6 / 3             6 / 3             6 / 6
Sup slots                                      2                 2                 2
Fabric module location                         Rear              Rear              Rear
Airflow                                        Front to Back     Front to Back     Front to Back
Power supply slots                             4                 8                 16
Power consumption (typical / max)              2425 W / 2620 W   4615 W / 5020 W   4742 W / 8462 W

Winning Points
• 32G FC line-rate ready
• Interchangeable line cards
• Redundant hardware
• Common PSUs, line cards
• Single OS, management
• Better UCS interoperability
Converged FEX Architecture
Enabling Director-class resiliency at Converged Access
• Multi-Protocol Storage and Host Connectivity: FC, FCoE and IP
• Converged Architecture includes Cisco Data center portfolio: SAN(MDS 9700), LAN(Nexus 2k-7k) and Compute (UCS Chassis) accessibility
MDS 9250i Multiservice Fabric Switch
Features
Multi-Protocol Support
• 16G FC, 10GE FCoE, 1GE/10GE FCIP, iSCSI
Intelligent Storage Services for FC and FCoE SANs
• Fibre Channel over IP (FCIP)
• IO Accelerator (IOA)
• Data Mobility Manager (DMM)
• Integrated Management via Data Center Network Manager (DCNM)
FICON Certified
Benefits
Single Platform for deploying Storage Services across FC, FCoE and IP based Storage Area Networks (SANs)
• High-Bandwidth SAN Extension across MAN/WAN
• Vendor independent array migration tools
• Interoperate data between FC and FCoE arrays
MDS 9250i Overview
• Next Generation Multiservice Intelligent Services-oriented Fabric Switch
• Provides FCIP, IOA and DMM
• Integrated 40x16G FC, 8x10GE FCoE, and 2x1GE/10GE FCIP/iSCSI ports
• Enclosure: 2 RU; Redundant and hot-swappable power supplies and fan trays
[Photo labels: Console, USB, Mgmt0]
Cisco MDS 9148S Fabric Switch
High-Performance, Easy to Deploy, Enterprise-class Fabric Switch
Versatile
• Line-rate 16/8/4/2G FC ports
• Industry-leading port range: start with 12-port base, scale up with 12-port license, or full 48-port option available
Easy to Use
• Automated Provisioning
• Quick Configuration Wizard
• Same OS and management across industry’s broadest SAN portfolio
Enterprise-Class
• Non-disruptive software upgrades
• Up to 32 Virtual SANs (VSANs)
• Inter-VSAN Routing (IVR), QoS, PortChannels, N-Port ID Virtualization (NPIV), N-Port Virtualization (NPV), Comprehensive Security
• Hardware-based slow-drain detection and recovery
Front: 48 x 16G FC line-rate performance, 12 to 48 ports in 12-port increments, 1 RU
Back: Dual power supplies and fans for enterprise-class availability
Introducing MDS 9396S 96-Port 16G Fabric Switch
Industry’s Most Affordable 16G Fabric Switch Family
Cisco MDS 9396S (2 RU)
Versatile
• Start with 48-port base; scale up with 12-port license, or full 96-port option available
Easy to Use
• Automated provisioning
• Quick Configuration Wizard
• Same OS and management across industry’s broadest SAN portfolio
Enterprise-Class
• Dual power supplies and fans, non-disruptive software upgrades
• Up to 4095 B2B credits per port (MDS 9396S); up to 253 B2B credits per port (MDS 9148S)
• Up to 32 Virtual SANs (VSANs)
• Hardware-based slow-drain detection and recovery, Inter-VSAN routing, QoS, PortChannels, N-Port ID Virtualization (NPIV), N-Port Virtualization (NPV)
• Forward Error Correction, Link Encryption (FC TrustSec)
MDS 9396S B2B credits
All ports in a port group can have a maximum of 500 B2B credits.
The Enterprise license enables extended credits, which allows up to 4095 B2B credits per port in a port group.
A port group can have a maximum of 4150 B2B credits.
Best Practice: Avoid grouping all E ports in the same port group/IOSlice.
Generic Formula: For every 1 km of distance at 1 Gbps, about 0.5 B2B credit is needed for a standard FC frame (2112 bytes).
CLI command: show port-resources module 1
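As a rough illustration of the generic formula on this slide, the Python sketch below estimates how many B2B credits a long-distance link needs. The function name, the rounding up to a whole credit, and the reading of "1GB speed" as 1 Gbps are assumptions made for this example; the 0.5 credit per km per Gbps figure is the slide's rule of thumb and only holds for full-size frames.

```python
import math


def b2b_credits_needed(distance_km, speed_gbps, credits_per_km_per_gbps=0.5):
    """Estimate B2B credits for a link.

    Uses the rule of thumb of 0.5 credit per km per Gbps for
    full-size (2112-byte) FC frames. Hypothetical helper, not a Cisco tool.
    """
    if distance_km < 0 or speed_gbps <= 0:
        raise ValueError("distance must be >= 0 and speed must be > 0")
    # Round up: a fraction of a credit still requires a whole buffer.
    return math.ceil(distance_km * speed_gbps * credits_per_km_per_gbps)


if __name__ == "__main__":
    for km in (10, 50, 100):
        print(f"{km} km at 16 Gbps: {b2b_credits_needed(km, 16)} credits")
```

With these assumptions, a 50 km link at 16 Gbps fits within the 500 credits quoted above, while a 100 km link would need the extended credits enabled by the Enterprise license.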
Topologies – UCS-FI N5K MDS
Separation makes sure that the design is highly available even when one of the fabrics goes down.
Redundant redundancy is not required
Working with TAC – Topology Commands
`show topology`
FC Topology for VSAN 1 :
Interface Peer Domain Peer Interface Peer IP Address(Switch Name)
---------------------------------------------------------------------
port-channel 6 0x62(98) port-channel 6 10.10.10.2 (sw201A)
Use the ‘show topology’ command to display the interswitch links
Working with TAC – Topology Commands
Core# show fcs ie
IE List for VSAN: 1
----------------------------------------------------------------------------
IE-WWN IE Mgmt-Id Mgmt-Addr (Switch-name)
----------------------------------------------------------------------------
20:01:00:0d:ec:39:19:c1 S(Rem) 0xfffc0e 10.10.10.1 (sw204A)
20:01:00:0d:ec:39:1a:01 S(Rem) 0xfffc03 10.10.10.9 (sw202A)
20:01:00:0d:ec:fb:88:41 S(Loc) 0xfffc65 10.10.10.3 (sw200A)
20:01:00:2a:6a:8c:0b:01 S(Adj) 0xfffc62 10.10.10.2 (sw201A)
Loc = Local = this switch
Adj = Adjacent = connected switch
Rem = Remote = more than one hop away
Use the ‘show fcs ie’ command to display all the switches in the VSAN
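When a capture of this output is reviewed offline, a small script can group the switches by their Loc/Adj/Rem role. The following is only a sketch: the regular expression is written against the sample rows on this slide, the function name is hypothetical, and the domain ID is derived on the assumption that the Mgmt-Id is the domain controller address 0xfffcXX, whose last byte is the domain ID (0x62 matches the peer domain 0x62(98) in the show topology example).

```python
import re

# Sample rows copied from the slide above.
SAMPLE = """\
20:01:00:0d:ec:39:19:c1 S(Rem) 0xfffc0e 10.10.10.1 (sw204A)
20:01:00:0d:ec:39:1a:01 S(Rem) 0xfffc03 10.10.10.9 (sw202A)
20:01:00:0d:ec:fb:88:41 S(Loc) 0xfffc65 10.10.10.3 (sw200A)
20:01:00:2a:6a:8c:0b:01 S(Adj) 0xfffc62 10.10.10.2 (sw201A)
"""

ROW = re.compile(
    r"^(?P<wwn>(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2})\s+"
    r"\S\((?P<kind>Loc|Adj|Rem)\)\s+"
    r"(?P<mgmt_id>0x[0-9a-fA-F]+)\s+"
    r"(?P<addr>\S+)\s+\((?P<name>[^)]*)\)"
)


def switches_by_kind(text):
    """Return {'Loc': [...], 'Adj': [...], 'Rem': [...]} of (name, addr, domain_id)."""
    result = {"Loc": [], "Adj": [], "Rem": []}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip headers, separators, and blank lines
        # Assumption: the low byte of the Mgmt-Id is the switch domain ID.
        domain_id = int(match.group("mgmt_id"), 16) & 0xFF
        result[match.group("kind")].append(
            (match.group("name"), match.group("addr"), domain_id)
        )
    return result


if __name__ == "__main__":
    for kind, switches in switches_by_kind(SAMPLE).items():
        print(kind, switches)
```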
Working with TAC – Topology Commands
Use the show fcns database npv command to find Cisco NPV switches.
Core# show fcns database npv
------------------------
VSAN 1
------------------------
NPV NODE-NAME :20:01:00:0d:ec:51:06:01
NPV IP_ADDR :14.16.134.192
NPV INTERFACE :port-channel 30
CORE SWITCH WWN :20:00:00:0d:ec:24:ef:c0
CORE INTERFACE :Po105
Working with TAC – Topology Commands
Use the show npv flogi-table command to show where a device is attached and which uplink it uses.
NPV# show npv flogi-table
--------------------------------------------------------------------------------
SERVER                                                                  EXTERNAL
INTERFACE VSAN FCID     PORT NAME               NODE NAME               INTERFACE
--------------------------------------------------------------------------------
fc1/9     1905 0x0d0060 10:00:00:00:c9:71:04:4e 20:00:00:00:c9:71:04:4e Po30
Working with TAC –Topology Commands
NPV# show npv internal info external-interface all | grep addr:
fabric mgmt addr: 10.17.150.20
fabric mgmt addr: 10.17.150.20
Note: Ensure both links are connected to the same upstream IP address. This tells you which upstream switch the UCS FI is connected to in case you need to connect to upstream switch to verify the connectivity.
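The check described in the note can be automated against saved command output. This is a minimal sketch that assumes the output contains one "fabric mgmt addr:" line per external interface, as in the capture above; the function names are hypothetical.

```python
def upstream_mgmt_addrs(output):
    """Collect the addresses from every 'fabric mgmt addr:' line."""
    addrs = []
    for line in output.splitlines():
        key, sep, value = line.partition("addr:")
        if sep and key.strip().endswith("fabric mgmt"):
            addrs.append(value.strip())
    return addrs


def same_upstream(output):
    """True when at least one uplink was found and all report the same address."""
    addrs = upstream_mgmt_addrs(output)
    return bool(addrs) and len(set(addrs)) == 1


if __name__ == "__main__":
    capture = "fabric mgmt addr: 10.17.150.20\nfabric mgmt addr: 10.17.150.20\n"
    print(upstream_mgmt_addrs(capture), same_upstream(capture))
```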
Nexus FC NPV vs FCoE-NPV
Nexus FC NPV vs FCoE-NPV
[Diagram: two designs side by side, each with an N5K edge switch uplinked to a Nexus or MDS NPIV core. N5K NPV (FC and NPV): both FC and FCoE. N5K FCoE-NPV: FCoE only.]
Nexus FC NPV vs FCoE-NPV
Enabling FC and NPV
First enable FCoE and then NPV
N5K(config)# feature fcoe
FC license checked out successfully
fc_plugin extracted successfully
FC plugin loaded successfully
FCoE manager enabled successfully
FC enabled on all modules successfully
N5K(config)# feature npv
Verify that boot variables are set and the changes are saved. Changing to npv mode erases the current configuration and reboots the switch in npv mode.
Do you want to continue? (y/n):y
Also enable fcoe qos
[Diagram: N5K NPV with FC or FCoE links, uplinked to a Nexus or MDS NPIV core]
Nexus FC NPV vs FCoE-NPV
Enabling FC and NPV
What actually happens after enabling fcoe and then NPV:
N5K(config)# feature npv
Verify that boot variables are set and the changes are saved. Changing to npv mode erases the current configuration and reboots the switch in npv mode.
Do you want to continue? (y/n):y
When the switch reloads into NPV mode, some configuration is preserved:
• switchname
• management ip configuration and vrf
• boot variable
• username / password details
• ntp configuration
• callhome configuration
• snmp-server details
• feature fcoe
[Diagram: N5K NPV with FC or FCoE links, uplinked to a Nexus or MDS NPIV core]
Nexus FC NPV vs FCoE-NPV
Enabling FCoE-NPV
Enable fcoe-npv
n5k(config)# feature fcoe-npv
FCoE NPV license checked out successfully
fc_plugin extracted successfully
FC plugin loaded successfully
FCoE manager enabled successfully
FCoE NPV enabled on all modules successfully
• No reload
• No reconfiguration
• fcoe qos still required
[Diagram: N5K FCoE-NPV with FCoE links, uplinked to a Nexus or MDS NPIV core]
Nexus FC NPV vs FCoE-NPV Comparison

                               FC NPV                     FCoE-NPV
Protocols                      FC and/or FCoE             FCoE
License                        FC_FEATURES_PKG            FCOE_NPV_PKG
Command                        feature fcoe, feature npv  feature fcoe-npv
Write Erase Reload             feature npv                no
Nexus Models                   N5K, N6K                   N5K, N6K, N9K
FCoE QoS                       Required                   Required
Disable FKA on Core for ISSU   No if uplink is FC         yes
Nested NPV – Can I connect two NPV Switches?
• After enabling NPV, NPIV can also be enabled
• Connecting two Cisco NPV switches is not supported
[Diagram, Unsupported: Cisco NPV connected over FC or FCoE to Cisco NPV+NPIV, which uplinks to Cisco NPIV. Supported: 3rd Party Vendor connected over FC or FCoE to Cisco NPV+NPIV, which uplinks to Cisco NPIV.]
Bugs
Frequent Bugs
CSCun41202 - Weak CBC mode and weak ciphers should be disabled in SSH server
Symptom: SSH servers on Cisco Nexus devices may be flagged by security scanners due to the inclusion of SSH ciphers and HMAC algorithms that are considered to be weak.
These may be identified as 'SSH Server CBC Mode Ciphers Enabled' and 'SSH Server weak MAC Algorithms Enabled' or similar.
Conditions: This issue applies to Cisco Nexus 7000, Cisco Nexus 5000 and MDS 9000 series switches. SSH functionality is enabled by default in Cisco NX-OS. The current SSH server status is displayed using the show ssh server command.
Frequent Bugs
CSCun41202 - Weak CBC mode and weak ciphers should be disabled in SSH server
With the Fix: If an SSH client configured to use weak ciphers is used to log in to a Cisco device with this fix, the login may fail. The following messages are logged in the switch syslog:
%DAEMON-2-SYSTEM_MSG: fatal: no matching cipher found: client 3des-cbc,blowfish-cbc server aes128-ctr,aes192-ctr,aes256-ctr - sshd
Reconfigure any SSH clients not to use weak ciphers like 3des-cbc or blowfish-cbc. DCNM uses SSH to manage Cisco devices and must be upgraded to at least 7.2(1) to work with devices with this fix.
Known-fixed-releases: 7.3(0)N1(1) 7.2(1)N1(1) 7.2(0)N1(1) 6.2(11c) 5.2(8g)
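To see which clients need reconfiguring, the syslog message above can be parsed for the cipher lists offered by each side. The sketch below is written only against the message format shown on this slide; the pattern and function name are assumptions for illustration.

```python
import re

# Message text copied from the slide above.
LOG = (
    "%DAEMON-2-SYSTEM_MSG: fatal: no matching cipher found: "
    "client 3des-cbc,blowfish-cbc server aes128-ctr,aes192-ctr,aes256-ctr - sshd"
)

PATTERN = re.compile(r"no matching cipher found: client (\S+) server (\S+)")


def parse_cipher_mismatch(message):
    """Return (client_ciphers, server_ciphers, common) or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    client = match.group(1).split(",")
    server = match.group(2).split(",")
    common = sorted(set(client) & set(server))
    return client, server, common


if __name__ == "__main__":
    client, server, common = parse_cipher_mismatch(LOG)
    print("client offers:", client)
    print("server accepts:", server)
    print("in common:", common)
```

An empty common list is what causes the login failure: the client has to offer at least one cipher the fixed server accepts.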
Frequent Bugs
CSCue79881 - SNMP crashes on SNMP bulk get query
Symptom: The SNMP process may crash with the following messages displayed in the output of show logging log:
%KERN-2-SYSTEM_MSG: mts_is_q_space_available_new():1416:Total mtsbufsize 10070872 for sap 28, exceeds limit 15 perc of 67108864 - kernel
%KERN-2-SYSTEM_MSG: mts_acquire_q_space() failing - no space in sap 28, uuid 26 send_opc 3176, pid 3616, proc_name sctpt_rx_thr - kernel
%KERN-2-SYSTEM_MSG: [sap 28][pid 4406][comm:snmpd] sap recovering failed and so Killed - kernel
%SYSMGR-2-SERVICE_CRASHED: Service "snmpd" (PID 4406) hasn't caught signal 6 (core will be saved)
Frequent Bugs
CSCue79881 - SNMP crashes on SNMP bulk get query
Conditions: This bug affects both Nexus and MDS switches. It has been observed when a monitoring device is using snmp-bulk-get requests on the entity-MIB for multiple FEX modules at one time, or if there is continuous polling from multiple polling stations on slow MIBs.
Some examples of MIBs that may be affected by continuous SNMP bulk walk are:
• qos mib
• entity mib
• entity-fru mib
• bridge mib
Frequent Bugs
CSCue79881 - SNMP crashes on SNMP bulk get query
Workaround: A possible workaround is configuring the no snmp-server counter cache enable command. This command prevents SNMP bulk gets from getting cached via the use of MTS buffers. This will prevent the MTS buffers from getting consumed and resulting in a process crash. The result of the command is that the interface table might be slower to update the statistics (since caching is disabled).
• Note: This workaround is only available on Nexus 7000 switches.
This defect is fixed in NX-OS releases 5.2.9, 6.1.4 (Nexus 7000), 5.2.8g (MDS), and 6.2.1 (Nexus 7000 and MDS), 6.0(2)N1(2a) (Nexus 5K and 6k)
Further Problem Description: One way to verify whether you are affected by this bug is to issue the command show system internal mts buffers summary and check if notifications for sap 28 are increasing.
Frequent Bugs
CSCus64671 - MDS 9700 show tech detail missing some commands
Symptom: MDS 9710 and MDS 9706 show tech detail missing some commands, like 'show running-config' and 'show startup-config'.
Conditions: MDS 9710 and MDS 9706 at NX-OS 6.2(11).
Workaround: Collect
• show tech-support all along with show tech-support detail
• Fixed in NX-OS 6.2(11c) and above.
Frequent Nexus 5K/6K Bugs
CSCuj87061 - Unified fc interfaces come up as Ethernet after disruptive upgrades
Symptom: In Cisco Nexus 5000 series switches, a disruptive upgrade with reason "incompatible image" causes the Unified Ports configured as FC ports to come up as Ethernet ports after the upgrade. However, the FC port configuration still exists in the running configuration.
Conditions: Upgrade between any two incompatible images where the FC interfaces are unified interfaces requiring the slot and port commands:
slot z
port x - y mode fc
Back up the configuration before upgrading
Frequent Nexus 5K/6K Bugs
CSCuj87061 - Unified fc interfaces come up as Ethernet after disruptive upgrades
Proactive Workaround: Do ISSU only between compatible images. Check the result of the install command for image compatibility.
Reactive Workaround: After the disruptive ISSU between incompatible images, do the following:
a. copy startup-config bootflash:
b. copy running-config startup-config
c. reload
After reload:
d. copy bootflash: running-config
e. copy running-config startup-config
Now the device should have the same configuration as before the upgrade.
Consult the Release Notes for the non-disruptive upgrade path
Frequent Nexus 5K/6K Bugs
CSCup75270 - FC interfaces are not listed in IF-MIB snmp
Symptom: FC interfaces are not listed in IF-MIB snmp walk.
Device Manager is not working correctly with the Nexus 5548UP or 5596UP (GEM modules installed) when the expansion module ports are set to fibre channel mode.
Hovering over the ports with the mouse in Device Manager will display, for example, "Ethernet 1/17 Status: failed".
Looking at the same ports via CLI will show that the ports are really in FC mode and not configured as Ethernet ports.
Frequent Nexus 5K/6K Bugs
CSCup75270 - FC interfaces are not listed in IF-MIB snmp
Conditions:
• Nexus 5548UP or Nexus 5596UP running NX-OS 7.0(2)N1(1) with GEM Expansion module ports configured to operate in Fibre Channel mode
• Some ports are in Fibre Channel mode on the base chassis
More Info: NX-OS 7.0(1)N1(1) and all previous software versions are not affected by this defect. This is an NX-OS bug, not a Device Manager bug.
Fixed in 7.0(6)N1 and above.
Nexus 5K/6K Bug
CSCur10558 Trunk Protocol Enable does not show in running config when disabled
Resolution Summary:
1. Made the command “[no] trunk protocol enable“ hidden
2. Added appropriate warning message when the command is run on CLI
Fixed in 7.3(0)N1(1) 7.2(1)N1(1) 7.1(3)N1(1) 7.0(7)N1(1) 6.2(9)
5548-1(config-if)# no trunk protocol
Warning: This will globally disable the switch's ability to form any trunks and impacts existing trunk ports
Do you wish to continue(y/n)? [n]
Nexus 5K/6K Bug
CSCur10558 Trunk Protocol Enable does not show in running config when disabled
• [no] trunk protocol is not shown in the running config or in the show tech detail.
• Tip: show tech will contain show port internal info all. You should always see "Epp state: Enabled".
• Port Trunking Protocol (PTP) and Port Channel Protocol (PCP) use the EPP frame.
• There is no practical reason to disable Fibre Channel trunking.
Cisco MDS NX-OS 7.3(0)D1(1) OUI Enhancement
Example: Adding OUIs
Switch(config)# wwn oui 0x10001c
• OUI - A 24 bit globally unique number assigned by IEEE.
• Port-channel functionality includes Cisco OUI check of peer switch.
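To find the OUI that a peer switch or device presents, the 24-bit value can be pulled out of its WWN. The sketch below is an illustration rather than what NX-OS does internally; it assumes the common WWN layouts in which NAA type 1 and 2 names carry the OUI in bytes 3 to 5, and NAA type 5 names carry it immediately after the first hex digit. The function name is hypothetical.

```python
def wwn_oui(wwn):
    """Return the 24-bit OUI embedded in a colon-separated 64-bit WWN."""
    digits = wwn.replace(":", "").lower()
    if len(digits) != 16 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 64-bit WWN: {wwn!r}")
    naa = digits[0]
    if naa in ("1", "2"):
        # NAA 1/2: 4-bit NAA + 12 bits, then the 24-bit OUI.
        return int(digits[4:10], 16)
    if naa == "5":
        # NAA 5: 4-bit NAA followed directly by the 24-bit OUI.
        return int(digits[1:7], 16)
    raise ValueError(f"unsupported NAA type {naa!r} in this sketch")


if __name__ == "__main__":
    # WWNs taken from the show fcs ie and show npv flogi-table examples.
    for wwn in ("20:01:00:0d:ec:39:19:c1", "10:00:00:00:c9:71:04:4e"):
        print(wwn, hex(wwn_oui(wwn)))
```

The resulting value is in the same hexadecimal form that the wwn oui command above takes as its argument.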
MDS Slow Drain Troubleshooting Enhancements
MDS Slow Drain Troubleshooting Enhancements
• NX-OS 6.2(9) and 6.2(13) added several enhancements
  • system timeout no-credit-drop triggered at exact time by HW
  • TxWait
  • slowport-monitor
  • New port-monitor counters
    • txwait
    • tx-slowport-oper-delay
    • tx-slowport-count
• show tech-support slowdrain
• DCNM Slow Drain Analysis
• TAC tool MDS_show_tech_slowdrain_analysis
• For more comprehensive information, see:
  • TAC-Time - SAN Congestion! Understanding, Troubleshooting, Mitigating in a Cisco Fabric (2016 Las Vegas)
  • https://www.ciscolive.com/online/connect/sessionDetail.ww?SESSION_ID=90897&backBtn=true
system timeout no-credit-drop
• no-credit-drop causes frames to be dropped immediately if the destination port is at 0 Tx credits for the time specified
• Previously no-credit-drop was triggered by SW process at 100ms intervals
• NX-OS 6.2(9) and later triggered by the HW at exact time the threshold is reached
• Should be used in conjunction with lowering congestion-drop threshold
• Recommended for F ports
• Can drastically improve ISL performance under slow drain conditions
• xxx_FORCE_TIMEOUT_ON/OFF counter
• By default no-credit-drop is not enabled
system timeout no-credit-drop 200 mode f
TxWait enhancements
txwait
• txwait is a counter that increments every 2.5us when the port is at 0 Tx credits and there are frames queued for transmit
• txwait * 2.5 / 1000000 = seconds of time the port was unable to transmit
• Only applies to the following:
  • MDS 9500 with generation 4 linecards:
    • MDS 9000 Family 32-Port 8-Gbps Advanced Fibre Channel Switching Module (DS-X9232-256K9)
    • MDS 9000 Family 48-Port 8-Gbps Advanced Fibre Channel Switching Module (DS-X9248-256K9)
  • MDS 9700 48-Port 16-Gbps Fibre Channel Switching Module (DS-X9448-768K9)
  • MDS 9148S 16G Multilayer Fabric Switch
  • MDS 9250i Multiservice Fabric Switch
  • MDS 9396S 16G Multilayer Fabric Switch
• Others will return zero
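The conversion from raw txwait ticks to seconds can be sketched in a few lines of Python. The function and constant names are assumptions for this example; the 2.5us tick size and the formula come from the slide, and the sample value 6252650 is the counter shown in the show interface example in this deck.

```python
TXWAIT_TICK_US = 2.5  # each txwait increment represents 2.5 microseconds


def txwait_seconds(ticks):
    """Convert a raw txwait counter value to seconds at 0 Tx credits."""
    return ticks * TXWAIT_TICK_US / 1_000_000


if __name__ == "__main__":
    print(txwait_seconds(6252650))
```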
TxWait enhancements - continued
txwait can be seen in the following:
• show interface counters
  • Raw value in 2.5us units
  • Percentage Tx credits not available for last 1s/1m/1h/72h
• show process creditmon txwait-history
  • 60sec, 60min, 72hour graphs
• show logging onboard txwait
• SNMP fcIfTxWaitCount variable
TxWait enhancements - continued
txwait - show interface counters
mds9710-1# show interface fc1/13 counters | i fc|wait
fc1/13
6252650 2.5us Txwaits due to lack of transmit credits
6252650 * 2.5 / 1000000 = 15.631625 seconds
• Cumulative since the interface counters were last cleared
• The above indicates the MDS was not able to transmit for over 15 seconds since the counters were last cleared
TxWait enhancements - continued
txwait - Percentage Tx credits not available for last 1s/1m/1h/72h
• Utilizes the underlying txwait counter
MDS9710-1# show interface fc1/13 counters
fc1/13
…
5 Transmit B2B credit transitions to zero
2 Receive B2B credit transitions to zero
557320 2.5us TxWait due to lack of transmit credits
Percentage Tx credits not available for last 1s/1m/1h/72h: 1%/5%/3%/2%
32 receive B2B credit remaining
128 transmit B2B credit remaining
128 low priority transmit B2B credit remaining
Level 1: Latency - Troubleshooting
txwait - show logging onboard txwait
MDS9513# show logging onboard txwait module 4
…
---------------------------------
Module: 4 txwait count
---------------------------------
Notes:
- Sampling period is 20 seconds
- Only txwait delta >= 100 ms are logged
-----------------------------------------------------------------------------
| Interface | Delta TxWait Time     | Congestion | Timestamp                |
|           | 2.5us ticks | seconds |            |                          |
-----------------------------------------------------------------------------
| fc4/1     | 52927       | 0       | 0%         | Wed May 27 13:20:12 2015 |
| fc4/1     | 2005222     | 5       | 25%        | Wed May 27 13:19:52 2015 |
| fc4/1     | 105854      | 0       | 1%         | Wed May 27 13:19:32 2015 |
| fc4/1     | 52926       | 0       | 0%         | Wed May 27 13:19:12 2015 |
• Delta values are recorded every 20 seconds, and only when they are 100ms or more in the 20 second interval
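The seconds and congestion columns in this output can be reproduced from the delta tick count. The sketch below assumes that both columns are truncated to whole numbers, which is inferred from the four sample rows rather than documented behavior; the function names are hypothetical.

```python
SAMPLING_PERIOD_S = 20   # from the Notes section of the command output
TICK_S = 2.5e-6          # one txwait tick is 2.5 microseconds


def txwait_row(delta_ticks):
    """Return (whole seconds, whole congestion percent) for one 20 s sample."""
    seconds = delta_ticks * TICK_S
    percent = seconds / SAMPLING_PERIOD_S * 100
    # Assumption: the switch truncates rather than rounds both values.
    return int(seconds), int(percent)


def would_be_logged(delta_ticks):
    """True when the delta reaches the 100 ms logging threshold."""
    # 100 ms = 100000 us; compare in tenths of a microsecond to stay in integers.
    return delta_ticks * 25 >= 1_000_000


if __name__ == "__main__":
    for ticks in (52927, 2005222, 105854, 52926):
        print(ticks, txwait_row(ticks), would_be_logged(ticks))
```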
Level 1: Latency - Troubleshooting
txwait-history
• Graphical display of time where Tx credits are not available
• Similar in format to cpu history
• 3 graphs per port
  • Last 60 seconds
  • Last 60 minutes
  • Last 72 hours
• Utilizes the underlying txwait counter
mds9710-1# show process creditmon txwait-history module 1 port 13
TxWait history for port fc1/13:
==============================
[Graph: Credit Not Available per second (last 60 seconds); x-axis 0 to 60 seconds, y-axis 100 to 1000 ms; # = TxWait (ms)]
Level 1: Latency - Troubleshooting
slowport-monitor
• system timeout slowport-monitor <1-500> mode e|f – Must be configured!
• Events are captured every 100ms
• Last 10 events per port captured in slowport-monitor-events
• Logging onboard slowport-monitor-events captures more events
• Currently implemented for:
  • 9500 - Gen 3 LCs - DS-X9248-48K9 and DS-X92xx-96K9 modules
  • 9500 - Gen 4 LCs - DS-X9232-256K9 and DS-X9248-256K9 modules
  • 9700 & 9396S (Gen 5)
  • 9250i & 9148S
• Differences exist between Gen3, Gen4 and 9700/9250i/9148S/9396S
Level 1: Latency - Troubleshooting
slowport-monitor – 9700/9250i/9148S/9396S (Gen 5 LCs)
• Gen5/9250i/9148S/9396S have enhanced HW capabilities
• Each 100ms interval, the number of times Tx credits remained at 0 for the configured (admin) delay is counted.
• The average operational delay is determined – This is how long the port was at 0 Tx credits
• Recorded when at least one complete event occurred
• More events available via logging onboard slowport-monitor-events
MDS9710-1# show process creditmon slowport-monitor-events
Module: 01 Slowport Detected: YES
=========================================================================
Interface = fc1/13
----------------------------------------------------------------
| admin | slowport  | oper  | Timestamp                        |
| delay | detection | delay |                                  |
| (ms)  | count     | (ms)  |                                  |
----------------------------------------------------------------
| 5     | 1300      | 20    | 1. 04/01/15 23:03:38.823         |
| 5     | 1296      | 19    | 2. 04/01/15 23:03:38.724         |
| 5     | 1291      | 19    | 3. 04/01/15 23:03:38.623         |
…
| 5     | 1256      | 19    |10. 04/01/15 23:03:37.923         |
----------------------------------------------------------------
• admin delay: configured delay (5ms)
• oper delay: actual average delay
• slowport detection count: 4 events in last 100ms (1300 - 1296)
Note: Oper delay limited by no-credit-drop threshold
Slow Drain Alerting and Mitigation
Port-monitor alerting
• Port-monitor allows monitoring of several counters relating to slow drain
  • credit-loss-reco: Credit loss recovery counter
  • lr-rx: The number of link resets received by the fc-port
  • lr-tx: Link resets transmitted by the fc-port
  • timeout-discards: Timeout discards counter
  • tx-credit-not-available: Credit not available counter (in 100ms increments)
  • tx-discards: Tx discards counter
  • tx-slowport-count: Number of slowport events
  • tx-slowport-oper-delay: Slowport operational delay
  • txwait: Amount of time at 0 Tx credits and packets queued
  • rx-datarate: Rx data rate as a percentage of link speed
  • tx-datarate: Tx data rate as a percentage of link speed
Note: There are other counters that are valuable and should also be considered for inclusion in monitoring but are not part of slow drain
New in 6.2(13)
Troubleshooting – Documentation for TAC
show tech-support slowdrain
• Contains all the commands available that pertain to slow drain
• Contains “context” commands to understand the FC topology
• Contains name server commands to identify devices
• Contains active zonesets to understand device relationships
• Most useful when run from DCNM and gathered for the entire fabric
  • SAN Client -> Tools -> Run CLI Commands…
• When opening a case with the TAC, please have this available!
• Used for MDS_show_tech_slowdrain_analysis
DCNM Slow Drain Analysis
• DCNM 7.1(1) added Slow Drain Analysis
• DCNM 7.2(2) added improvements
• DCNM 10.0(1) added improvements
• Used for pulling fabric wide slow drain counters for a defined period of time
• Useful for ongoing slow drain problems
• Accessed from the Web Client Health -> Diagnostics -> Slow drain Analysis
MDS_show_tech_slowdrain_analysis
• Green indicates no congestion
• Yellow indicates level 1 (latency)
• Orange indicates level 2 (timeout drops)
• Red indicates level 3 (credit-loss)
• Arrows indicate direction of congestion
• Slow draining end device!
Thank you