Brocade SilkWorm 4100 Fibre Channel Switch Design Guide



Contents

Introduction
Architecture of the SilkWorm 4100
4 Gbit/sec Is the Strategic Industry Direction
SilkWorm 4100 SAN Design
SilkWorm 4100 SAN Deployment
SilkWorm 4100 SAN Management
Fabric OS 4.4 Overview
Summary
Appendix A: 4 Gbit/sec Fibre Channel Line Rate
Appendix B: Link Balancing and Aggregation
Appendix C: Distance Extension


Introduction

Brocade has recently introduced its fourth-generation switch, the SilkWorm 4100. The SilkWorm 4100 is the most feature-rich and advanced switch introduced by any vendor to date. This paper describes the technical features and benefits of the SilkWorm 4100 in detail. From time to time, it refers to the Brocade Design, Deployment, and Management guide (DDM), which can be obtained from the Brocade Partner Web site. This paper covers the enhancements specific to the SilkWorm 4100 and refers to the DDM for standard SAN guidelines.

Architecture of the SilkWorm 4100

The SilkWorm 4100 is powered by a single 32-port ASIC with integrated SerDes (a “switch on a chip”) that allows the SilkWorm 4100 to perform as a non-blocking and uncongested 32-port 4 Gbit/sec Fibre Channel switch. This means that it can support all ports running at 4 Gbit/sec full-duplex at the same time, all the time, in any traffic configuration. Because it uses an advanced central memory switching design, the SilkWorm 4100 is also free of Head-of-Line Blocking (HoLB), a performance issue that traditionally plagues crossbar systems. Expanded virtual channel capabilities also help to eliminate HoLB in larger network designs.

The SilkWorm 4100 is the first platform to ship with the Brocade “Ports-On-Demand” software architecture. This is an easy way for organizations to non-disruptively enable additional ports with a license key. The SilkWorm 4100 can be purchased in 16-port, 24-port, and 32-port versions. The base configuration has 16 ports enabled, and additional 8-port licenses can be applied until all 32 ports are enabled.

The SilkWorm 4100 is the first 4 Gbit/sec product from Brocade and is generally available in the market ahead of 4 Gbit/sec HBAs, storage devices, and switches from competitors. As a result, backward compatibility with installed base systems was a critical requirement for the ASIC and platform design: all ports are 1, 2, and 4 Gbit/sec auto-sensing (speed) and auto-negotiating (Fibre Channel port topology) to ensure seamless integration with existing environments.

Due to its highly integrated design, the SilkWorm 4100 is extremely efficient in its power usage. Even with its substantial performance improvements, it uses just 75 watts, compared to 125 watts for its predecessor, the SilkWorm 3900. In addition, the single-ASIC architecture improves reliability metrics such as the Mean Time Between Failures (MTBF) rating.

Line rate and reliability are not the only areas of improvement. The SilkWorm 4100 has 1024 buffer-to-buffer credits shared among its 32 ports, enabling much greater support for long-distance configurations. (For a detailed explanation of the buffer-to-buffer credit model, see “Appendix C: Distance Extension.”) Similarly, the hard zoning resources have been increased and are now shared among all ports to allow more flexible hardware-enforced zone sets. Additional platform advancements include enhanced trunking features: 32 Gbit/sec frame-level trunks (up to eight 4 Gbit/sec links per trunk) and Dynamic Path Selection (DPS) for exchange-level and device-level balancing between trunk groups. (See “Appendix B: Link Balancing and Aggregation” for more details on the new trunking options.) Furthermore, Brocade frame-level trunking has improved significantly: the failure of a “master link” within a trunk group is no longer disruptive.


4 Gbit/sec Is the Strategic Industry Direction

4 Gbit/sec Fibre Channel technology provides immediate benefits today. Its first use will be for Inter-Switch Links (ISLs) connecting switches together to form high-performance fabrics. The higher per-port bandwidth allows designers to use fewer ports as ISLs to achieve the same performance goals, thereby freeing up more ports for nodes such as hosts and storage arrays. Today, this is possible for networks containing SilkWorm 4100 switches that connect to each other at 4 Gbit/sec, or even faster using trunking. The SilkWorm 24000 Director will accept port cards supporting 4 Gbit/sec Fibre Channel when they are released in the future, so it is also possible to start deploying SilkWorm 4100 switches in existing core-to-edge networks today as a future-proofing strategy.

When 4 Gbit/sec HBAs and storage devices are widely available in 2005, organizations will be able to use the higher bandwidth to satisfy the needs of their highest-performing applications and their growing IT infrastructure. They will be able to reduce the amount of time required for backups by using the newest tape technology and disk-to-disk-to-tape schemes that will keep all tape drives busy and that require more than 2 Gbit/sec of bandwidth. Organizations will also need the increased bandwidth 4 Gbit/sec Fibre Channel technology provides to keep up with the changes resulting from new regulations. Regulatory compliance is leading to an increase in the amount of data that needs to be stored and recovered, and an increase in the number of devices that must be connected. Both trends increase the bandwidth requirements of the SAN.

2 Gbit/sec Fibre Channel devices are by no means obsolete: they are still successfully solving business problems today and will continue to do so for years to come. Since 4 Gbit/sec Fibre Channel is backward compatible to 2 Gbit/sec Fibre Channel (and 1 Gbit/sec for that matter), Brocade anticipates that over time 2 Gbit/sec devices will migrate to the edge of networks or be used for applications that do not require the highest levels of performance. Since 4 Gbit/sec Fibre Channel will eventually be available on storage devices and HBAs from all major storage and system vendors, organizations can deploy the SilkWorm 4100 today and know that it will work with the devices they already have and the devices they will purchase in the future.

SilkWorm 4100 SAN Design

When designing SANs with the SilkWorm 4100, organizations must consider a few special issues. For the most part, the SilkWorm 4100 can be treated just like a SilkWorm 3900. Most exceptions relate to 4 Gbit/sec SANs consisting exclusively of SilkWorm 4100s, or at least some SilkWorm 4100s connected to each other with 4 Gbit/sec ISLs. When a SilkWorm 4100 is deployed in an existing 2 Gbit/sec SAN, it will likely be deployed on the edge of an existing core-to-edge fabric. This means that it will be connected to the existing core switches, which will have 2 Gbit/sec interfaces. The same ISL over-subscription ratio should be used as with a SilkWorm 3900, because the ISLs and all the devices will run at 2 Gbit/sec. Once 4 Gbit/sec devices become available, it will be appropriate to consider adding ISLs or upgrading the core switches to 4 Gbit/sec products. Tools such as Brocade Advanced Performance Monitoring or Brocade SAN Health can be used to gather statistics on ISL usage to determine whether this step is necessary.

If the SilkWorm 4100 is deployed in an exclusively 4 Gbit/sec fabric, or if some part of the fabric is exclusively 4 Gbit/sec, or if it is being used for its enhanced distance extension capabilities, there are a few things to consider:


1. If the SAN is extended over dark fiber, xWDM, or SONET, the SilkWorm 4100 has extended distance capabilities described in “Appendix C: Distance Extension.” These capabilities can enhance performance greatly over longer distances, because the SilkWorm 4100 supports up to 255 buffer credits on a single port. This allows full-speed operation up to approximately 500 km at 1 Gbit/sec, approximately 250 km at 2 Gbit/sec, or approximately 125 km at 4 Gbit/sec.

2. If the SilkWorm 4100 is part of an exclusively 4 Gbit/sec network, or a certain part of the fabric has 4 Gbit/sec ISLs or trunks, organizations should consider reducing the total number of ISLs by half, assuming that all the nodes are still 2 Gbit/sec (a rough sketch of the oversubscription arithmetic follows this list). Note that a minimum of two ISLs between any edge switch and the core is still required to provide high availability. If the fabric has mixed speeds (both 2 Gbit/sec and 4 Gbit/sec nodes), the ISL count will likely be somewhere in between, depending on traffic patterns. Note that when connecting SilkWorm 4100s to each other, the platform supports 8-way trunking and DPS between trunks. These features alone can improve performance as much as 4 Gbit/sec interfaces do in many cases, and DPS in particular can operate even when connecting the SilkWorm 4100 to existing 2 Gbit/sec switches. Further information can be found in “Appendix B: Link Balancing and Aggregation.”
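
The guideline above is just oversubscription arithmetic. The Python sketch below is illustrative only; the port count and the 7:1 target ratio are assumptions chosen for the example, not Brocade sizing rules.

    import math

    def isls_needed(node_ports, node_speed_gbps, isl_speed_gbps, target_ratio):
        """ISLs required to hold a target oversubscription ratio
        (aggregate node bandwidth : aggregate ISL bandwidth),
        keeping at least two ISLs for high availability."""
        node_bw = node_ports * node_speed_gbps
        isl_bw_needed = node_bw / target_ratio
        return max(2, math.ceil(isl_bw_needed / isl_speed_gbps))

    # 28 edge ports of 2 Gbit/sec nodes with a 7:1 oversubscription target:
    print(isls_needed(28, 2, isl_speed_gbps=2, target_ratio=7))  # 4 ISLs at 2 Gbit/sec
    print(isls_needed(28, 2, isl_speed_gbps=4, target_ratio=7))  # 2 ISLs at 4 Gbit/sec, half as many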

Deploying the SilkWorm 4100 will help ensure investment protection and reduce deployment costs: new SAN designs will require fewer links to achieve the same performance, which increases ROI and reduces management overhead.

SilkWorm 4100 SAN Deployment

As previously described, the SilkWorm 4100 has been improved greatly over its predecessor and there are important features to consider during the SAN design phase. Other than those new features, the SilkWorm 4100 should be deployed in much the same way as its predecessor.

Physically, the switch has changed from 1.5 rack units (U) with the SilkWorm 3900 to 1U with the SilkWorm 4100, a 50 percent improvement in rack density. The SilkWorm 4100 also utilizes octal SFP cages as opposed to the single SFP cages in the SilkWorm 3900. (These are the same cages used on the SilkWorm 3250 and 3850 switches.) This also provides greater density. Because the octal cages are designed in a belly-to-belly configuration, it is still possible to extract SFPs without special tools. Airflow remains the same: both the SilkWorm 3900 and 4100 have back-to-front airflow (non-cable-side to cable-side). Power supplies and fans also remain dual-redundant and hot-pluggable, as in the SilkWorm 3900. However, the power requirements have been reduced significantly. This results in less heat being dissipated, and cooling requirements are reduced proportionally, which is a clear advantage for organizations with power and cooling issues in high-density racks.

For 4 Gbit/sec operation, the SilkWorm 4100 requires 4 Gbit/sec SFPs, which are available at first release. Certain types of 2 Gbit/sec SFPs are supported but will allow only 1 or 2 Gbit/sec operation.

Because the SilkWorm 4100 is the first 4 Gbit/sec Fibre Channel switch to market, and 4 Gbit/sec storage devices and HBAs are available only in limited sample quantities, some minor issues could occur with 4 Gbit/sec speed auto-negotiation. When such issues arise, the speed for a given port can be set manually using the “portCfgSpeed” command. For example, to fix port 19 at 4 Gbit/sec, run “portCfgSpeed 19 4”, where the last argument is the desired speed: “0” for auto-negotiate, “1” for 1 Gbit/sec, “2” for 2 Gbit/sec, and so on. It is also advisable to log a call with the support provider or directly with Brocade in the case of a speed negotiation issue.

Depending on which Brocade Partner provides the SilkWorm 4100, it will be delivered with one of two rail kits: “fixed” rails, which simply fix the switch in the rack, or sliding rails, which allow the SilkWorm 4100 to slide in and out of the cabinet.

SilkWorm 4100 SAN Management

The SilkWorm 4100 is managed just like the SilkWorm 3900 and other Fabric OS 4.x 2 Gbit/sec switches. Enhancements include support for 4 Gbit/sec operation and other new features specific to the 4 Gbit/sec Brocade product family. All interfaces, such as the CLI, Web Tools, Fabric Watch, Advanced Performance Monitoring, Fabric Manager, and the Fabric Access Layer API, have been enhanced to support these capabilities. The SilkWorm 4100 offers the features available in previous products in addition to the new ones mentioned above. SAN management has also been simplified, because the increased per-port speed means fewer ISLs are required to build and maintain the SAN.

Fabric OS 4.4 Overview

Fabric OS 4.4 is the latest version of Brocade firmware. It provides several new features as well as hardware enablement for the SilkWorm 4100. Below is a summary of the new features. For more detailed configuration information on each feature, refer to the Fabric OS 4.4 Procedures Guide, the Fabric OS 4.4 Command Reference Manual, or the “help” command in the CLI. For each feature below, it is noted whether the feature is supported on the SilkWorm 4100 only or across the entire product family. The following switches support Fabric OS 4.4: SilkWorm 3016 (IBM BladeCenter switch), 3250 (8 ports), 3850 (16 ports), 3900 (32 ports), 4100, 12000, and 24000. All of these are 2 Gbit/sec switches except the SilkWorm 4100.

• FICON, FICON cascading, and FICON CUP (SilkWorm 3900, 4100, 12000, and 24000)
• Scalability: support for fabrics up to 2560 ports, as long as all switches use Fabric OS 4.4 (all products)
• Usability improvements and enhancements to Web Tools, Advanced Performance Monitoring, and Fabric Watch (all products)
• Dynamic Path Selection (SilkWorm 4100 only, as described in the appendix)
• 8-way trunking (SilkWorm 4100 only, as described in the appendix)
• Network boot using a TFTP server (SilkWorm 4100 only)
• Dual domain support for the SilkWorm 24000
• Mixed blade support for the SilkWorm 24000
• Trunking over distance (all products, including the SilkWorm 3250 and 3850; the 3250/3850 must run Fabric OS 3.2 for trunking over distance support)
• Security enhancements (all products):
  o FC-SP (DH-CHAP)
  o SSL/HTTPS
  o RADIUS
  o SNMP v3
  o Additional audit logging
  o Non-disruptive “SecModeEnable”
  o Secure Fabric OS over gateway support
  o Support for up to 15 different user accounts
  o Secure Fabric OS scalability up to 2560-port fabrics
• FDMI enhancements such as FDMI hostname support (all products)
• SNMP enhancements such as link up/down support and RAS error log (all products)

Summary

With a cross-sectional internal bandwidth of a quarter of a terabit per second, the SilkWorm 4100 is the highest-performing 32-port SAN switch in the industry by a wide margin. It is the optimal choice for organizations that want the benefits of a SAN switch at a low entry price with the option to scale on a “pay-as-you-grow” basis at a low incremental cost. The SilkWorm 4100 meets mission-critical test criteria and offers significant advantages as organizations develop and grow. The proven intelligence of the industry-leading SilkWorm product family gives organizations the investment protection of Brocade storage networking solutions and the industry’s largest and most developed partner ecosystem. With advanced flexibility, performance, and high-availability features, the SilkWorm 4100 can improve resource utilization and administrator productivity, lowering the overall total cost of ownership for storage.

Appendix A: 4 Gbit/sec Fibre Channel Line Rate

With the SilkWorm 4100, Brocade introduced a native 4 Gbit/sec Fibre Channel interface speed. The platform supports auto-negotiation with 1 and 2 Gbit/sec Fibre Channel on all ports for backward compatibility. This allows node connections at 4 Gbit/sec as well as higher speeds and lower costs for ISLs.

Like 2 Gbit/sec Fibre Channel, native 4 Gbit/sec was defined by the FC-PH-2 standard in 1996. Despite the fact that the 4 Gbit/sec standard was ratified at the same time as the 2 Gbit/sec standard, no 4 Gbit/sec products were built until 2004. There was a debate in the Fibre Channel industry for years about whether or not to build 4 Gbit/sec products at all, or to go straight to 10 Gbit/sec. The debate ended in 2003 when the Fibre Channel Industry Association (FCIA) voted to adopt 4 Gbit/sec, and all major Fibre Channel vendors began to add 4 Gbit/sec products to their roadmaps. In fact, both Brocade and the majority of other FCIA members had been positioned on the 10 Gbit/sec side of the issue prior to 2003. The factors that motivated both Brocade and the industry as a whole to change direction included both technological and economic trends.

Technical Drivers for Native 4 Gbit/sec Fibre Channel

Two questions in the 4 Gbit/sec versus 10 Gbit/sec debate were whether speeds higher than 2 Gbit/sec were needed at all and, if so, which of the candidates could be deployed in the most practical way.

Higher speeds were needed for several reasons. Some hosts and storage devices, such as large tape libraries, were running fast enough to saturate 2 Gbit/sec interfaces. In some cases, this was causing a business-level impact: if a backup device could stream data faster, backup windows could be reduced and/or fewer tape devices would be required. Furthermore, running faster ISLs would mean needing fewer of them, thus saving cost on switches and cabling. For long-distance applications running over xWDM or dark fiber, the reduction in the number of links could provide substantial ongoing cost savings.


For these and many other reasons, the industry acknowledged that 2 Gbit/sec speeds were no longer sufficient for storage networks. The choice was whether to use 4 or 10 Gbit/sec as the primary strategy. It turned out that 4 Gbit/sec had substantial technical advantages related to deployment, and provided at least the same performance benefits as 10 Gbit/sec.

Hosts and storage devices that were exceeding the 2 Gbit/sec interface capacity were not doing so by a large amount. Some tape drives were designed to stream at between 3 and 4 Gbit/sec, and some hosts could match these speeds. However, only a small number of the highest-end systems in the world could exceed 4 Gbit/sec, and even these could not generally sustain 10 Gbit/sec streams. Perhaps the biggest barrier to widespread deployment of 10 Gbit/sec was its innate incompatibility with existing infrastructure. It required different optical cables, used different media, and was not backward compatible with 1 or 2 Gbit/sec. Needing to rip and replace all HBAs and storage controllers at once, not to mention an entire data center cable plant, would not only be prohibitively expensive but operationally impossible in the “always-on” data centers that power today’s global businesses.

It became clear because of these factors that the optimal speed for nodes would be 4 Gbit/sec. However, there was still a case to be made for ISLs at 10 Gbit/sec.

Replacing the optical infrastructure would be less of a technical issue with backbone connections, because there are always fewer of them than there are node connections. Additionally, some high-end installations require that switch-to-switch connections run faster than 4 Gbit/sec. Indeed, some networks require backbones to run at far higher than 10 Gbit/sec speeds. No matter how fast an individual interface can be made, there always seems to be an application that needs more bandwidth. Brocade decided to solve this with trunking for 4 Gbit/sec interfaces, giving 4 Gbit/sec networks performance superiority over 10 Gbit/sec while still lowering costs and simplifying deployments. (Brocade can create a single 256 Gbit/sec balanced path using 4 Gbit/sec trunking plus DPS.)

Another technical factor that had to be considered was network redundancy. Most organizations configure links in pairs, so that there will be no outage if one link should fail. With a single 10 Gbit/sec link, any component failure will result in an outage, which means that the minimum realistic configuration between two switches is 20 Gbit/sec (two 10 Gbit/sec links). Relatively few applications require so much bandwidth between each pair of switches. Given the cost of 10 Gbit/sec interfaces, redundancy would be harder to cost-justify, and from a network engineering point of view, any factor that motivates organizations towards non-redundant deployments is a negative market force.

To fully appreciate this, consider the performance parity case. If three 4 Gbit/sec links are configured, and one fails, the channel is 33 percent degraded. For a network with the exact same performance requirement, a single 10 Gbit/sec link is needed, which is more expensive than the three 4 Gbit/sec interfaces combined, and requires more expensive single-mode optical infrastructure. If that link fails, the network has an outage, because 100 percent of bandwidth is lost. This requires a second 10 Gbit/sec link to be provisioned, with all the associated cost and deployment complexity, even though the additional performance is not required. If a 10 Gbit/sec proponent were to argue that two times the performance were really needed, the 4 Gbit/sec proponent could configure six 4 Gbit/sec links, which would still cost far less, have higher availability, and perform identically. In fact, the larger the configuration and the higher the performance requirements, the greater the advantages of 4 Gbit/sec from both a performance and availability point of view.
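
The availability argument above is simple arithmetic. The Python sketch below, with illustrative link counts only, shows the remaining bandwidth after a single link failure for each configuration discussed.

    def after_failure(link_count, link_speed_gbps):
        """Total bandwidth, bandwidth remaining after one link fails,
        and the percentage lost."""
        total = link_count * link_speed_gbps
        remaining = (link_count - 1) * link_speed_gbps
        lost_pct = 100.0 * (total - remaining) / total
        return total, remaining, lost_pct

    for links, speed in [(3, 4), (1, 10), (6, 4), (2, 10)]:
        total, remaining, lost = after_failure(links, speed)
        print(f"{links} x {speed} Gbit/sec: {total} -> {remaining} Gbit/sec "
              f"({lost:.0f}% lost on one failure)")
    # 3 x 4:  12 -> 8 Gbit/sec (33% lost, still running)
    # 1 x 10: 10 -> 0 Gbit/sec (100% lost: an outage)
    # 6 x 4:  24 -> 20 Gbit/sec (17% lost)
    # 2 x 10: 20 -> 10 Gbit/sec (50% lost)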

All of this adds up to substantial technical advantages for 4 Gbit/sec above 10 Gbit/sec for the vast majority of deployment cases. Until mainstream nodes can saturate 4 Gbit/sec channels, 4 Gbit/sec Fibre Channel is likely to remain the mainstream interface speed for storage networks.


Economic Drivers for Native 4 Gbit/sec Fibre Channel

4 Gbit/sec Fibre Channel provides a higher Return On Investment (ROI) than 10 Gbit/sec due to its lower acquisition cost and its ability to fit into existing infrastructure.

4 Gbit/sec interfaces use the same low-level technology and standards as 1 and 2 Gbit/sec across the board: the encoding format is just one example. One way to think of a 4 Gbit/sec switch is that it is like running a 2 Gbit/sec switch with a higher clock rate. The net result is that 4 Gbit/sec products can be marketed at about the same price as existing products. 10 Gbit/sec, on the other hand, is fundamentally different: it uses technology that requires different components, which are all produced in much lower volume. This is true to such an extent that current price projections indicate that three 4 Gbit/sec links will cost less than one 10 Gbit/sec link, so even deploying equal bandwidth is more economical with 4 Gbit/sec.
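
Because 1, 2, and 4 Gbit/sec Fibre Channel all use the same 8b/10b encoding and differ only in clock rate, the usable data rate scales linearly with the signaling rate. A quick sketch of that arithmetic, using the standard Fibre Channel signaling rates and ignoring framing overhead:

    # 8b/10b encoding carries 8 data bits in every 10 line bits, so the
    # usable data rate is 80 percent of the signaling rate.
    signaling_gbaud = {"1 Gbit/sec": 1.0625, "2 Gbit/sec": 2.125, "4 Gbit/sec": 4.25}

    for name, gbaud in signaling_gbaud.items():
        data_bits_per_sec = gbaud * 1e9 * 8 / 10
        mbytes_per_sec = data_bits_per_sec / 8 / 1e6
        print(f"{name}: {gbaud} Gbaud -> ~{mbytes_per_sec:.0f} MB/s per direction")
    # Roughly 106, 212, and 425 MB/s per direction before framing overhead.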

Not only were 10 Gbit/sec interfaces projected to be more expensive, but the optical infrastructure organizations already installed for 1 and 2 Gbit/sec would not work with 10 Gbit/sec devices. 4 Gbit/sec, on the other hand, could use the existing cable plants, providing backward compatibility.

Since 4 Gbit/sec products cost less than 10 Gbit/sec products even at performance parity, and installation would be less expensive as well, the economic debate came out firmly on the side of 4 Gbit/sec, just as had the technical discussion.

Native 4 Gbit/sec Adoption Timeline

At every point in the price/performance/redundancy/reliability map, 4 Gbit/sec is more desirable than 10 Gbit/sec. All major Fibre Channel vendors have 4 Gbit/sec on their roadmaps, including switch, router, HBA, and storage manufacturers. The FCIA has officially backed this movement, and it is expected that much of the Fibre Channel equipment shipping in 2005 will run at this speed.

During the early adoption timeframe for 4 Gbit/sec, 2 Gbit/sec native switches will still be in high-volume production. At first, 4 Gbit/sec technology will be available only in selected “pizza box” switches such as the SilkWorm 4100. It is usual for director-class products to follow behind switches, since modular platforms are by nature harder to engineer, test, and market. During this period, 4 Gbit/sec switches will likely be deployed in standalone configurations, as the cores and/or edges of small to medium core-to-edge networks, and as edge switches in larger SANs.

Once 4 Gbit/sec blades begin to ship, 2 Gbit/sec directors at the edge of fabrics will likely have all net-new blades purchased with 4 Gbit/sec chips. There is probably no real incentive for most organizations to throw out their existing 2 Gbit/sec blades, so it is likely that 4 Gbit/sec ports will simply reside alongside the existing 2 Gbit/sec interfaces within existing chassis.1 The new 4 Gbit/sec blades will likely replace 2 Gbit/sec ISLs going to the core. Directors at the core of large SANs will either have their blades upgraded (4 Gbit/sec blades purchased and old blades transferred to edge chassis) or, in some cases, the entire core chassis might be migrated to the edges of a fabric.

1 Brocade intends to offer 4 Gbit/sec blades that can coexist with SilkWorm 24000 2 Gbit/sec blades in the same chassis, but at least two other vendors require forklift chassis upgrades. Be sure to ask if a 2 Gbit/sec chassis purchased today will support 4 and 10 Gbit/sec blades in the future, and if these can coexist with existing blades.


The time lag between edge switches and directors is not considered to be a problem: the industry does not believe that 2 Gbit/sec is by any means obsolete. Most organizations do not immediately require 4 Gbit/sec interfaces, and many will be able to use their 2 Gbit/sec switches for years to come.

Some time after the first 4 Gbit/sec switches ship, node vendors will start to come out with 4 Gbit/sec interfaces. Most organizations will not have an immediate need for 4 Gbit/sec HBAs, for example, so it is likely that only net-new installations will use this speed. (This is one reason backward compatibility with 1 and 2 Gbit/sec was so important.)

By the end of 2005, it is expected that all major vendors will ship 4 Gbit/sec interfaces by default on products in every segment, and that the vast majority of new deployments will use this speed almost exclusively.

Appendix B: Link Balancing and Aggregation

Even in over-provisioned networks, there might be “hot spots” of congestion, with some paths running at their limit while others are unused. In other words, the network might be a performance bottleneck even if it has sufficient bandwidth to deliver all data flows without constraint. This happens when a network does not have the intelligence to balance loads across all available paths. The unused paths might still be of some value for redundancy, but not for performance. Brocade has three options for supporting more evenly balanced networks.

All Brocade platforms support source-port route balancing via FSPF, known as Dynamic Load Sharing (DLS). In addition to DLS, all 2 Gbit/sec Brocade fabric switches and directors support frame-level trunking between ASICs; the size of a trunk group can be four or eight links per trunk, depending on the ASIC architecture. This is known as Advanced ISL Trunking. The SilkWorm 4100 and the SilkWorm Multiprotocol Router both support exchange-level trunking, also known as Dynamic Path Selection (DPS). On switches with hardware support for the latter two forms of balancing, an optional license key is required to enable them.

All three options can increase availability as well as performance. If multiple links are configured, a link failure will be dealt with automatically by the network instead of causing a failover event on all hosts simultaneously. Performance might be degraded while the problem is fixed, but the overall network path will still be available. (The availability impact is even more apparent for hosts that do not use multipathing software.) A similar effect happens when adding a link: with frame-level trunking, links can be added dynamically without disrupting existing I/O streams. Without an advanced link balancing mechanism, either the new link(s) would be unused, or I/O would be disrupted.

Dynamic Load Sharing: FSPF Route Balancing

All Fibre Channel fabrics support the FSPF protocol. It is part of the base operating system as long as fabric and E_Port functions are present. FSPF calculates the topology of a fabric and determines the path cost between endpoints. In many network topologies, such as the popular resilient core-to-edge design, there will be more than one equal-cost path between a source and any given destination edge switch. Which path to use can be controlled on a per-port basis from the source switch. By default, FSPF will attempt to spread connections from different ports across available paths at the source-port level. Brocade switches have an option that allows FSPF to reallocate routes whenever in-order delivery can still be assured. This might happen when a fabric rebuild occurs, when device cables are moved, or when ports are brought online after being disabled. This feature is called Dynamic Load Sharing (DLS).
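
The idea of source-port route balancing can be illustrated with a short sketch. Fabric OS's actual route-assignment heuristic is not described in this paper, so the round-robin assignment below is purely an assumption, used only to show how each source port is pinned to one equal-cost path when routes are set up:

    def assign_routes(source_ports, equal_cost_paths):
        """Pin each source port to one equal-cost path. The choice is made
        at route setup time, not per frame, which preserves in-order
        delivery but cannot react to which routes later become 'hot'."""
        return {port: equal_cost_paths[i % len(equal_cost_paths)]
                for i, port in enumerate(source_ports)}

    routes = assign_routes(source_ports=range(16),
                           equal_cost_paths=["path-to-core-A", "path-to-core-B"])
    print(routes[0], routes[1], routes[2])  # path-to-core-A path-to-core-B path-to-core-A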


DLS does a “best-effort” job of distributing I/O by balancing source port routes. However, some ports might still carry more traffic than others, and DLS cannot predict which routes will be “hot” when it sets up routes, since they must be allocated before I/O commences. Also, traffic patterns tend to change over time, so no matter how routes were distributed initially, it would still be possible to have hot spots appear later, and changing the route allocation randomly at runtime would cause out-of-order delivery.2 Balancing the number of routes allocated to a given path is not the same as balancing I/O, and so DLS does not do a perfectly even job of balancing traffic.

The DLS feature is useful, and since it is free and works automatically, some form of it is used in virtually all Brocade multi-switch fabrics. However, DLS does not solve all performance problems, so there is a need for more evenly balanced methods. Such methods require hardware support, since path selection needs to be done on a frame-by-frame basis, and doing this in software would have a disastrous performance impact. The trunking methods described below fill this need.

Advanced Trunking: Frame-Level Load Balancing

Frame-Level Trunking Implementation

Trunking allows traffic to be evenly balanced across ISLs while preserving in-order delivery. Brocade supports two forms of trunking: frame-level and exchange-level trunking. The frame-level method balances I/O such that each successive frame can go down a different physical ISL, and the receiving switch ensures that the frames are forwarded onward in their original order. Figure 1 shows a frame-level trunk between two SilkWorm 3850 edge switches.

For this to work, there must be high intelligence in both the transmitting and receiving switches.

At the software level, switches must be able to auto-detect that forming a trunk group is possible, program the group into hardware, display and manage the group of links as a single logical entity, calculate the optimal link costs, and optimally manage low-level parameters like buffer-to-buffer credits and virtual channels. Management software must represent the trunk group properly. For the trunking feature to have broad appeal, this must be as user-transparent as possible.

At the hardware level, the switches on both sides of the trunk must be able to handle the division and reassembly of several multi-gigabit I/O streams at wire speed without dropping a single frame or delivering even one frame out of order. To add to this challenge, there are often differences in cable length between different ISLs. Within a trunk group, this creates a skew between the amount of time each link takes to deliver frames. This means that the receiving ASIC will almost always receive frames out of order, and therefore must be able to calculate and compensate for the skew to reorder the stream properly.3

2 One Fibre Channel switch vendor attempted to use a similar method to change routes while I/O was in flight. As it quickly discovered, this caused massive degrees of out-of-order delivery: something that violates Fibre Channel standards and, as a practical concern, breaks many applications.

3 There are limitations to the amount of skew that an ASIC can tolerate, but the limits are high enough that they do not generally apply. The real-world implication is that it is not possible to configure one link in a trunk to go clockwise around a large dark fiber ring while another link goes counterclockwise. As long as the differences in cable length are measured in a few tens of meters or less, there will not be an issue. However, if the differences are larger, a trunk group cannot form. Instead, the switch would create two separate ISLs and use either DLS or DPS to balance them.


Figure 1. Frame-Level Trunking Concept

Frame-Level Trunking Advantages

The main advantage of frame-level trunking is that it provides optimal performance: a trunk group using this method truly aggregates the bandwidth of its members. The feature also increases availability by allowing non-disruptive addition of members to a trunk group and minimizing the impact of failures.

Frame-Level Trunking Limitations

On Brocade 2 Gbit/sec Fibre Channel switches,4 multiple groups of two to four 2 Gbit/sec links each can be combined into balanced pipes of 4 to 8 Gbit/sec each. Figure 1 shows a simple case of this between two SilkWorm 3850 switches. Figure 2 shows another trunked configuration, this time between two SilkWorm 3850 edge switches and two SilkWorm 24000 directors, using two 2-port trunk groups each.

4 Such as the SilkWorm 3200, 3250, 3600, 3800, 3850, 3900, 12000, and current 24000 blades.


Figure 2. Frame Trunking plus DLS

On newer switches such as the SilkWorm 4100 and forthcoming SilkWorm 24000 blades, it is possible to configure multiple groups of up to eight 4 Gbit/sec links each. The effect is just like Figure 1, except that per-trunk performance is quadrupled: instead of forming multiple 8 Gbit/sec pipes, the SilkWorm 4100 can create balanced 32 Gbit/sec pipes (64 Gbit/sec full-duplex). When connecting a SilkWorm 4100 or other future 4 Gbit/sec switches to a 2 Gbit/sec switch, a “lowest common denominator” approach is used, meaning that the trunk groups will be limited to four 2 Gbit/sec links instead of eight 4 Gbit/sec links.

As mentioned previously, frame-level trunking requires that all ports in a given trunk reside within an ASIC port group at each end of the link. While a frame-level trunk group will outperform either DLS or DPS solutions every time, using links only within port groups limits configuration options. The solution is to combine frame-level trunking with one of the other methods, as shown in Figure 2, which shows frame-level trunking operating within port groups and DLS operating between trunks. Even though the ISLs are all within a single port group on the edge switches, the four links could not form one trunk because they go to different port groups on different cores, and changing that would reduce availability.

On 2 Gbit/sec switches, port groups are built on contiguous 4-port groups called quads. On a SilkWorm 3250, for example, there are two quads: ports 0–3 and ports 4–7. On 4 Gbit/sec switches like the SilkWorm 4100, trunking port groups are built on contiguous 8-port groups called octets. In that product, there are four octets: ports 0–7, 8–15, 16–23, and 24–31.
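
A small sketch of the port-group rule described above; the group sizes come straight from the text (quads of 4 on 2 Gbit/sec switches, octets of 8 on the SilkWorm 4100), while the helper function itself is just for illustration:

    def same_trunk_group(ports, group_size):
        """True if all ports fall within one contiguous ASIC port group
        (group_size = 4 for quads, 8 for octets)."""
        return len({p // group_size for p in ports}) == 1

    print(same_trunk_group([12, 13, 14, 15], group_size=8))  # True: octet 8-15 on a SilkWorm 4100
    print(same_trunk_group([6, 7, 8, 9], group_size=8))      # False: spans octets 0-7 and 8-15
    print(same_trunk_group([0, 1, 2, 3], group_size=4))      # True: quad 0-3 on a 2 Gbit/sec switch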

It is also possible to configure multiple trunks within a port group. For example, on the SilkWorm 3850, it is possible to configure one trunk on ports 12–13 and a separate trunk on ports 14–15. These trunks could go to different core switches in a core-to-edge network or to different blades on a director. Figure 2 illustrates this case as well.

It is also important to understand how frame-level trunking interacts with Brocade Extended Fabrics software.


Each ASIC has a certain number of buffer-to-buffer credits. On 2 Gbit/sec switches, these are sufficient to operate all four ports in a quad in any mode over short distances.5 Over longer distances, it is necessary to limit the functions of some of the ports. For example, all 2 Gbit/sec switches can support a 4-port trunk at 10 km, but at 50 km only a 2-port trunk is supported with the other two ports limited to node attachment.6 At 100 km, only one long-distance port can be configured, which precludes trunking. It is possible to configure multiple 100 km links on the SilkWorm 3016, 3250, 3850, and 24000: up to one per quad. However, since these cannot be in the same port group, it is necessary to use DLS to balance them, not trunking.

4 Gbit/sec switches such as the SilkWorm 4100 have more flexible support for trunking over distance. In the 4 Gbit/sec switches, buffers are shared across the entire chip, not limited by quads or octets. It is possible, for example, to configure up to 8-port by 4 Gbit/sec trunks at 50 km (32 Gbit/sec trunk group) or 4-port by 4 Gbit/sec trunks at 100 km (16 Gbit/sec trunk group). In some cases, it might be desirable to configure trunks using 2 Gbit/sec links. For example, the trunk group might cross a DWDM that does not have 4 Gbit/sec support. In this case, an 8-port 2 Gbit/sec trunk can span up to 100 km.

Dynamic Path Selection: Exchange-Level Trunking

Dynamic Path Selection (DPS) is a new trunking method that, at the time of this writing, is available on the SilkWorm 4100 and the SilkWorm Multiprotocol Router. Brocade intends to release additional platforms with this feature in the future.

The DPS method works by striping Fibre Channel exchanges across equal-cost paths. An exchange identifier is placed by the sender into every Fibre Channel frame header. In normal operation, the exchange ID remains consistent for the duration of a SCSI operation. When a DPS-enabled platform receives a frame, it takes all equal-cost routes and calculates the egress port from that set based on a formula using the Sender PID (SID), Destination PID (DID), and Exchange ID (OXID). The formula will always select the same path for a given SID, DID, OXID set.
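
The exact formula used by the ASIC is not published in this guide, but its behavior can be sketched: any deterministic function of SID, DID, and OXID, reduced modulo the number of equal-cost paths, keeps every frame of an exchange on one path. The hash below is an assumption chosen only to illustrate that behavior:

    def dps_select_path(sid, did, oxid, equal_cost_paths):
        """Deterministic: the same (SID, DID, OXID) always yields the same
        path, so frames within one exchange stay in order on one link."""
        index = (sid ^ did ^ oxid) % len(equal_cost_paths)
        return equal_cost_paths[index]

    paths = ["trunk-0", "trunk-1", "trunk-2", "trunk-3"]
    print(dps_select_path(0x010200, 0x020300, 0x0001, paths))  # same inputs ...
    print(dps_select_path(0x010200, 0x020300, 0x0001, paths))  # ... same path
    print(dps_select_path(0x010200, 0x020300, 0x0002, paths))  # a new exchange may take another path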

Effectively, DPS stripes I/O at the SCSI level.7 For a given “conversation” between a host and storage port, one SCSI command would go down the first path, and the next command would go down a different path. All frames within a given exchange would be delivered in order by virtue of going down the same network path. The potential exists for out-of-order delivery between different SCSI operations, but all devices tested to date are capable of handling this gracefully. Consider it this way: if two different hosts are writing to two different storage ports across the same network, in-order delivery between the different hosts is not important. It only matters that in-order delivery occurs within the data stream sent by each host, not between the two different and unrelated streams.

This causes subtle differences in performance versus frame-level trunking. The result is that exchange-level trunking matches or outperforms all similar features from any vendor except for Brocade frame-level trunking itself. Because DPS can be combined with frame-level trunking, it is possible to achieve both maximum performance and availability, as shown in the following section.

5 “Short” can be about 25 km without performance degradation and longer if full 2 Gbit/sec throughput per port is not required.

6 For long-distance trunking, certain minimum Fabric OS firmware versions might be required.

7 “Effectively” is an important word. This does happen almost all the time, but there could be some Fibre Channel devices that do not map SCSI operations to exchange IDs, and other protocols like FICON and IP/FC do not necessarily behave this way either.


Exchange-Level Trunking Advantages

While performance might be slightly lower in some cases, there are also several advantages to the DPS method.

For one thing, exchange-level trunking does not need to occur within ASIC port groups the way frame-level trunking must be configured. This allows load balancing across different core switches in a core-to-edge network or different blades in a director, rather than mere DLS route balancing. Figure 3 shows an example of this.

Figure 3. Frame Trunking plus DPS

In addition, DPS is not exclusive of frame-level trunking. It is possible to balance several groups of ports using the frame-level method, and then balance between the resulting trunk groups using the exchange-level method. This provides the optimal balance of performance (frame-level is faster) and availability (exchange-level provides the flexibility to balance high-availability network topologies). This is also illustrated in Figure 3.

Next, DPS can balance I/O sent from an enabled platform to any other platform even if the destination does not support the feature. Path selection is made by the transmitting switch, and the receiver does not need to do anything special to ensure in-order delivery. This allows full backward compatibility with existing switches and provides some performance benefit even if not all switches in a fabric are using the latest technology. This is shown in Figure 4.


Figure 4. DPS in a Mixed Fabric

Finally, DPS can balance I/O across long-distance configurations not supported by frame-level trunking. For example, if there are two links configured between two sites that take substantially different paths, there might be too much skew to form a frame-level trunk. DPS would still be able to balance these links, since it does not rely on de-skew timers for in-order delivery. Figure 5 shows an example of this.

Figure 5. DPS with Large Fiber Ring

Link Balancing Summary

All networks with more than one ISL can benefit from link balancing. The most effective high-performance/availability designs combine two forms of balancing: frame-level trunking for best performance, and DLS or DPS to balance multiple trunk groups for the highest availability. It is axiomatic that network utilization will always grow to meet network capacity: no matter what bandwidth is initially configured, eventually more will be needed. Therefore, even if a SAN does not require trunking for performance right away, a flexible design will facilitate a high-performance implementation when needed in the future.

Appendix C: Distance Extension

History

With Brocade 2 Gbit/sec switches, a total of 108 buffers are available per quad. By default, 16 buffer credits are assigned to any F_ or FL_Port, and 26 are assigned to local (not long-distance) E_Ports. The extra buffers are available to all ports in the quad as a pool. Any node that reaches 100 percent utilization would automatically go to the pool for additional buffer credits before releasing its dedicated credits. The pool is also used to support long-distance ISLs.

The SilkWorm 4100 has a total of 1024 buffer credits, which are shared among its 32 ports. Twenty-four of these credits are used for the embedded port, and the rest are available for user consumption. With the SilkWorm 4100, F_ and FL_Ports receive eight buffer credits by default, and local E_Ports (L0 mode) receive 26 buffer credits. (This is the same credit allocation for both 2 and 4 Gbit/sec ports.) A minimum of eight buffer credits is reserved for each port so that no port is ever starved of credits. The rest are available in a buffer pool, which can be configured for use by any of the 32 ports. With the SilkWorm 4100, buffers are no longer automatically assigned out of the pool as in previous products, because line speed can easily be achieved with eight buffer credits for a local device. Assigning more credits to a port can be accomplished with the “portCfgLongDistance” command as long as the Extended Fabrics license is installed. Long-distance links are no longer automatically assigned additional credits unless a long-distance mode is configured for the E_Ports.

A single port can be assigned up to 255 buffer credits, which equates to approximately 500 km at 1 Gbit/sec, approximately 250 km at 2 Gbit/sec, and approximately 125 km at 4 Gbit/sec. Table 1 below illustrates the different configurations over distance supported by the SilkWorm 4100, including frame-trunked links over distance. To achieve the longer distances, it is necessary to have a SilkWorm 4100 (or future 4 Gbit/sec blades for the SilkWorm 24000) on both ends of the long-distance link.

Table 1. Extended Trunking Distances and Data Transfer Speeds

Ports Trunked    4 Gbit/sec                  2 Gbit/sec                  1 Gbit/sec
                 Throughput    Distance      Throughput    Distance      Throughput    Distance
8                32 Gbit/sec   30 km         16 Gbit/sec   60 km         NA            NA
4                16 Gbit/sec   60 km         8 Gbit/sec    125 km        NA            NA
2                8 Gbit/sec    125 km        4 Gbit/sec    175 km        NA            NA
1                4 Gbit/sec    125 km        2 Gbit/sec    250 km        1 Gbit/sec    500 km
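
The distances above follow roughly from the relationship between buffer-to-buffer credits, frame size, and the speed of light in fiber: a link stays full when enough credits are outstanding to cover one round trip. The sketch below assumes full-size (2,148-byte) frames, 8b/10b encoding, and about 5 microseconds of propagation per kilometer; it is an approximation, not a Brocade sizing formula.

    FRAME_BYTES = 2148          # maximum Fibre Channel frame size
    FIBER_US_PER_KM = 5.0       # one-way propagation delay in fiber

    def credits_needed(distance_km, signaling_gbaud):
        """Credits required to keep a link streaming at full speed."""
        frame_time_us = FRAME_BYTES * 10 / (signaling_gbaud * 1e9) * 1e6  # 10 line bits per byte (8b/10b)
        round_trip_us = 2 * distance_km * FIBER_US_PER_KM
        return round_trip_us / frame_time_us

    for name, gbaud, km in [("1 Gbit/sec", 1.0625, 500),
                            ("2 Gbit/sec", 2.125, 250),
                            ("4 Gbit/sec", 4.25, 125)]:
        print(f"{name} at {km} km needs ~{credits_needed(km, gbaud):.0f} credits")
    # Each case lands near the 255-credit maximum that a single port can be assigned.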

© 2005 Brocade Communications Systems, Inc. All Rights Reserved.

Brocade, the Brocade B weave logo, Fabric OS, Secure Fabric OS, and SilkWorm are registered trademarks of Brocade Communications Systems, Inc., in the United States and other countries. FICON is a registered trademark of IBM Corporation in the U.S. and other countries. All other brands, products, or service names are or may be trademarks or service marks of, and are used to identify, products or services of their respective owners.

Important Notice: Use of this paper constitutes consent to the following conditions. This document is supplied “AS IS” for informational purposes only, without warranty of any kind, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this book may require an export license from the United States government.
