Data Center Fabric Architectures

The term Data Center Fabric is as meaningless as switching or cloud – every major networking vendor is announcing or selling a data center fabric solution, and no two vendors have anything remotely similar in mind. Even worse, all fabric architectures announced so far are proprietary.

To choose the best solution for your data center, you must look beyond white papers and marketectures and figure out what’s really going on behind the scenes. An in-depth understanding of how different fabric architectures work will help you identify their benefits, drawbacks and potential pitfalls.

This article describes five classes of fabric architectures based on how they use the management, control and data (forwarding) planes. Throughout the article we’ll use the generic term switch to describe a forwarding device that can forward either Ethernet frames (layer-2 switch) or IP datagrams (layer-3 switch).

Fabric architectures are analyzed based on the behavior between physical devices. Every chassis switch usually has a single active control plane controlling multiple data planes (one per linecard), making it a Borg architecture within the chassis. Regardless of the in-chassis implementation details, a fabric architecture should be categorized as a Borg architecture only when a single control-plane entity controls data planes in multiple physical devices.

The Principles

Every forwarding device has three “planes”:

Management plane interacts with the network operator (usually via CLI or web interface), network management systems (usually via SNMP) and other external entities (sometimes using NETCONF or other XML- or REST-based APIs). Management plane software configures the device and keeps track of the current and saved device configuration.

Control plane runs control protocols (LACP, LLDP, CDP, STP, ARP/ND, OSPF, BGP ...), exchanges network topology and reachability information with adjacent devices and uses the exchanged information (or information gleaned from the forwarded packets) to build forwarding tables, which are used to forward the frames/packets traversing the device.

Data (forwarding) plane uses the forwarding tables to forward layer-2 frames or layer-3 datagrams. The data plane might send packets it cannot process to the control plane (example: IP datagrams violating access lists) or extract information from forwarded packets and send that information to the control plane (dynamic MAC address learning).
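To make the three planes more tangible, here is a minimal Python sketch of a hypothetical layer-3 switch: the control plane learns routes and installs them into the forwarding table, and the data plane performs longest-prefix lookups against that table. The class, interface names and prefixes are illustrative assumptions, not any vendor’s actual implementation.

  import ipaddress

  class Switch:
      def __init__(self):
          self.fib = {}  # forwarding table: prefix -> egress interface

      # Control plane: learns reachability (e.g. via OSPF or BGP) and
      # installs the results into the forwarding table
      def install_route(self, prefix, interface):
          self.fib[ipaddress.ip_network(prefix)] = interface

      # Data plane: longest-prefix match on the destination address;
      # packets it cannot handle are punted to the control plane
      def forward(self, destination):
          address = ipaddress.ip_address(destination)
          matches = [p for p in self.fib if address in p]
          if not matches:
              return "punt-to-control-plane"
          best = max(matches, key=lambda p: p.prefixlen)
          return self.fib[best]

  sw = Switch()
  sw.install_route("10.0.0.0/8", "eth0")   # learned via a routing protocol
  sw.install_route("10.1.0.0/16", "eth1")  # a more specific route
  print(sw.forward("10.1.2.3"))            # eth1 (longest prefix wins)
  print(sw.forward("192.0.2.1"))           # punt-to-control-plane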

More information: Control and Data Plane and Protecting the Router’s Control Plane articles.

Impact of Multi-Chassis Link Aggregation

Link Aggregation Group (LAG, defined in 802.3ad and 802.1AX) is a mechanism that allows Ethernet devices to use parallel links deployed for redundancy or load balancing purposes in an active-active configuration (without LAG, STP would disable some of the links). Redundant uplinks are commonly used for server attachments; multiple links between adjacent network devices are often used to increase the available bandwidth.
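As a quick illustration of the load-balancing part: LAG implementations typically select the member link by hashing packet header fields, keeping every flow on a single link (avoiding packet reordering) while spreading different flows across the bundle. A minimal sketch, assuming a hypothetical 5-tuple hash and made-up interface names:

  import zlib

  LAG_MEMBERS = ["eth1", "eth2", "eth3", "eth4"]

  def pick_member(src_ip, dst_ip, protocol, src_port, dst_port):
      # Hash the 5-tuple; all packets of a flow pick the same member link
      key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
      return LAG_MEMBERS[zlib.crc32(key) % len(LAG_MEMBERS)]

  print(pick_member("10.0.0.1", "10.0.0.2", "tcp", 49152, 80))
  print(pick_member("10.0.0.1", "10.0.0.2", "tcp", 49152, 80))  # same link
  print(pick_member("10.0.0.3", "10.0.0.2", "tcp", 50000, 80))  # may differ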

The 802.1AX standard defines link bundles between two adjacent devices. All networking vendors have implemented proprietary extensions to support multi-chassis link aggregation (MLAG).

Every fabric architecture should support MLAG. Architectures using a single control plane support MLAG by definition; all other architectures have to use proprietary mechanisms between the devices terminating an MLAG bundle.

Business as Usual

Summary: Each device has its own independent management, control and data planes.

Each switch operates independently and remains a separate management and configuration entity. This approach has been used for decades in building the global Internet and thus has proven scalability. It also has well-known drawbacks (a large number of independently managed devices) and usually requires thorough design to scale well.

A Business as Usual approach using Spanning Tree Protocol (STP) does not work well with the large layer-2 domains commonly deployed in virtualized data centers. A data center fabric using the Business as Usual architecture has to replace STP with a more scalable alternative. TRILL and SPB (802.1aq) are the standard candidates; Cisco’s FabricPath and Brocade’s VCS are proprietary alternatives.

As long as access-layer switches are not TRILL/SPB-enabled, the data center fabric must support multi-chassis link aggregation (MLAG) to optimize bandwidth utilization.
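The bandwidth argument against STP can be shown with a back-of-the-envelope sketch: in a hypothetical two-spine, three-leaf topology, STP keeps only a loop-free tree and blocks the remaining links, while a TRILL/SPB-style fabric computes shortest paths and can use all equal-cost links. The topology below is an illustrative assumption:

  from collections import deque

  # Full-mesh leaf-and-spine wiring: every leaf connects to every spine
  spines, leaves = ["S1", "S2"], ["L1", "L2", "L3"]
  links = {(s, l) for s in spines for l in leaves}
  neighbors = {node: [] for node in spines + leaves}
  for s, l in links:
      neighbors[s].append(l)
      neighbors[l].append(s)

  # STP in a nutshell: build a single tree from the root bridge (BFS
  # here); every link that is not part of the tree gets blocked
  tree, seen, queue = set(), {"S1"}, deque(["S1"])
  while queue:
      node = queue.popleft()
      for peer in neighbors[node]:
          if peer not in seen:
              seen.add(peer)
              tree.add((node, peer))
              queue.append(peer)

  blocked = len(links) - len(tree)
  print(f"STP: {len(tree)} forwarding, {blocked} blocked out of {len(links)} links")

  # A shortest-path fabric can use both spines between any two leaves
  ecmp_paths = [("L1", spine, "L2") for spine in spines]
  print(f"Shortest-path fabric: {len(ecmp_paths)} equal-cost L1-L2 paths")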

Examples:

  • Cisco’s Nexus 5000 and Nexus 7000 switches
  • Brocade’s VCS fabric
  • Force10 Z-series switches will support TRILL with a software upgrade

The Borg

Summary: Devices in a Borg fabric have a single (usually redundant) management and control plane. Each device has an independent data plane managed by the single centralized control plane.

In the Borg architecture (sometimes known as stacking on steroids) numerous switches form a cluster collective and elect a single control plane (or outsource the control plane functions to an external device) that controls the whole cluster. The cluster of devices appears as a single control- and management-plane entity to the outside world. It’s managed as a single device, has a single configuration, single instance of STP and one set of routing adjacencies with the outside world.

Examples: stackable switches, Juniper’s virtual chassis, HP’s IRF, Cisco’s VSS

Like the original Borg, switch cluster architectures cannot cope well with splits from the central brains. Cisco’s VSS reloads the primary switch when it detects a split-brain scenario; HP’s IRF and Juniper’s virtual chassis disable the switches that lose cluster quorum.
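A minimal sketch of the quorum logic behind the IRF / virtual chassis behavior described above; the cluster size and the exact actions are illustrative assumptions, not any vendor’s actual algorithm:

  CLUSTER_SIZE = 4  # total members configured in the cluster

  def on_cluster_split(reachable_members):
      # A partition keeps forwarding only if it holds a strict majority;
      # minority partitions shut down to avoid two active control planes
      # fighting over the same addresses and protocol adjacencies
      if reachable_members * 2 > CLUSTER_SIZE:
          return "remain active"
      return "disable forwarding (lost quorum)"

  print(on_cluster_split(3))  # majority partition: remain active
  print(on_cluster_split(1))  # minority partition: disable forwarding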

While vendors like to talk about all-encompassing fabrics, the current implementations usually limit the number of high-end devices in the cluster to two (Cisco’s VSS, Juniper’s EX8200+XRE200 and HP’s IRF), reducing the Borg architecture to a Siamese twin one.

Furthermore, most implementations of the Borg architecture still limit the switch clusters to devices of the same type (exception: you can build mixed-model virtual chassis with EX4200 and EX4500 switches from Juniper).

As you cannot combine access- and core-layer switches into the same fabric, you still need MLAG between the access and the core layer.

At the moment, all Borg-like implementations are proprietary.

The Big Brother

Also known as controller-based fabric, this architecture uses dumb(er) switches that perform packet forwarding based on instructions downloaded from the central controller(s). The instructions might be control-plane driven (L3 routing tables downloaded into the switches) or data-plane driven (5-tuples downloaded into the switches to enable per-flow forwarding).

The controller-based approach is ideal for protocol and architecture prototyping (which is the primary use case for OpenFlow) and for architectures with hub-and-spoke traffic flows (wireless controllers), but it remains to be seen whether it scales in large any-to-any networks.
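A toy model of the data-plane-driven (OpenFlow-style) variant might look like the sketch below; the class names, ports and path are hypothetical, and real controllers are vastly more complex:

  class DumbSwitch:
      def __init__(self, name):
          self.name = name
          self.flow_table = {}  # 5-tuple -> output port

      def install_flow(self, match, out_port):
          self.flow_table[match] = out_port

      def forward(self, pkt):
          # No local intelligence: unknown flows go to the controller
          return self.flow_table.get(pkt, "send-to-controller")

  class Controller:
      def __init__(self, switches):
          self.switches = switches

      def handle_unknown_flow(self, flow, path):
          # The controller picks the path and downloads per-flow
          # entries into every switch along it
          for switch_name, out_port in path:
              self.switches[switch_name].install_flow(flow, out_port)

  switches = {name: DumbSwitch(name) for name in ("edge1", "core1", "edge2")}
  controller = Controller(switches)

  flow = ("10.0.0.1", "10.0.0.2", "tcp", 49152, 80)
  print(switches["edge1"].forward(flow))  # send-to-controller
  controller.handle_unknown_flow(flow, [("edge1", 2), ("core1", 5), ("edge2", 1)])
  print(switches["edge1"].forward(flow))  # 2 (installed by the controller)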

Hybrid Architectures

Some implementations appear to use one of the above architectures from the outside, but actually use a different architecture internally. For example, the stackable switches from Juniper use VCCP (an IS-IS-like protocol) internally to distribute MAC address reachability information.

A stack of Juniper switches is thus a Quilt (each device has its own control plane; the whole stack has a single management plane), but appears as a Borg from the outside (single STP and LACP instance).
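A minimal sketch of that split, assuming hypothetical classes: a single management-plane entity distributes one configuration to all stack members, while each member runs its own control plane and synchronizes MAC reachability with its peers (the role VCCP plays in a Juniper stack):

  class MemberControlPlane:
      def __init__(self, member_id):
          self.member_id = member_id
          self.mac_table = {}  # MAC address -> (member, port)

      def learn(self, mac, port):
          # Each member learns MAC addresses on its own ports
          self.mac_table[mac] = (self.member_id, port)

      def sync_from_peer(self, peer):
          # VCCP-like exchange of reachability between members
          self.mac_table.update(peer.mac_table)

  class StackManagementPlane:
      # Single management entity: one configuration for the whole stack
      def __init__(self, member_count):
          self.members = [MemberControlPlane(i) for i in range(member_count)]

      def apply_config(self, config):
          # The one configuration is distributed to every member
          return {member.member_id: config for member in self.members}

  stack = StackManagementPlane(3)
  stack.members[0].learn("aa:bb:cc:dd:ee:ff", port=5)
  stack.members[1].sync_from_peer(stack.members[0])
  print(stack.members[1].mac_table)  # member 1 learned the MAC via member 0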

Revision History

2011-03-11 Original text published in the @ioshints blog
2011-05-24 Document migrated to www.ioshints.info
  • Added the Principles section and numerous links to in-depth articles
  • Added the Hybrid Architectures section
2011-05-25 Juniper supports mixed-model virtual chassis with EX4200 / EX4500 switches