Arista 7700R4 Distributed Etherlink Switch for Accelerated Computing

The Arista 7700R4 Distributed Etherlink Switch™ (DES) is an ultra-scalable, smart distributed system that builds on the foundations of the 7800R4 to deliver over 27,000 800GbE ports and petabits per second of capacity, combined with fully fair, lossless packet delivery - imperative for supporting the largest scale AI and ML applications built on any flavor of XPU.

The Ultra Ethernet-ready DES is optimized to support the needs of leading edge, large scale accelerated compute clusters, which depend on a non-congested, non-blocking interconnect between accelerators for the efficient execution of thousands of simultaneous high bandwidth transactions. Congestion, packet loss and failures can dramatically reduce the efficiency of a workload or stall processing altogether, all of which reduce the useful working time and return on investment (ROI) of high value compute farms.

The unique scalability and fully coordinated logical single-hop behavior are made possible by an automated, fully hardware accelerated architectural framework that implements massively parallel, distributed end-to-end scheduling combined with deterministic traffic spraying across all available leaf-spine interfaces. Together these deliver 100% efficient path utilization and equal treatment of every traffic flow with no tuning required - critical attributes for supporting single large workloads as well as mixed multi-tenant and multi-generational jobs in parallel.
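The effect of deterministic spraying can be illustrated with a minimal sketch (Python; the cell and link abstractions below are hypothetical simplifications, not a description of the actual silicon): traffic is segmented and distributed evenly across every available leaf-spine link, so no single link carries a disproportionate share of any flow.

from itertools import cycle

def spray(cells, fabric_links):
    """Round-robin a flow's cells across every available fabric link.

    Illustrative only: each link receives an equal share of traffic
    regardless of flow size, so a single large flow cannot overload
    any one leaf-spine path.
    """
    assignment = {link: [] for link in fabric_links}
    for cell, link in zip(cells, cycle(fabric_links)):
        assignment[link].append(cell)
    return assignment

# Example: 1,000 cells from one flow sprayed across a leaf's 20 fabric links
links = [f"fabric-{i}" for i in range(20)]
load = spray(range(1000), links)
print({link: len(cells) for link, cells in load.items()})  # 50 cells on every link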

Advanced Arista Etherlink® features for large scale accelerated compute clusters add workload visibility, advanced load balancing and telemetry to the rich feature set provided by EOS. The 7700R4 DES is designed to be forward compatible with Ultra Ethernet.

Arista AI Etherlink Portfolio


Arista 7700R4 Series Platforms:

The 7700R4 DES system consists of dedicated leaf and spine systems which are combined to create a large distributed switch. As a distributed system, DES is designed for pay-as-you-grow scaling starting from hundreds of ports, with a predictable, linear CapEx and OpEx trajectory to maximum capacity.

Platform Features
7700R4 AI Capable Network Switch

7700R4C-38PE
  • DES Distributed Leaf Switch
  • Accelerated compute optimized pipeline
  • 18 x 800GbE (36 x 400GbE) OSFP800 host ports
  • 20 x 800GbE (40 x 400GbE) fabric ports
  • 14.4 Tbps of wire speed performance with 16 GB of buffers
7720R4-128PE
  • DES Distributed Spine Switch
  • Accelerated compute optimized pipeline
  • 128 x 800GbE (256 x 400GbE) fabric ports
  • 102.4 Tbps of wire speed performance

Flexible Solutions with 800G and 400G

  • Up to 18 ports of 800G (36 ports of 400G) per leaf switch
  • Up to 22 petabits per second per cluster
  • Up to 8 trillion packets per second per cluster
  • Wire speed L3 forwarding
  • Latency from under 4 microseconds
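The cluster-level figures above follow directly from the per-leaf numbers. A quick back-of-the-envelope check (Python; the 1,536-leaf count is an assumption chosen to line up with the published maximum of over 27,000 x 800G hosts, not an Arista-published configuration):

# Back-of-the-envelope check of the cluster-level figures from per-leaf numbers.
# The leaf count below (1,536) is an assumption chosen to match the published
# "over 27,000 x 800G host" maximum; it is not a published figure.
leaves = 1536
host_ports_per_leaf = 18        # 800GbE host ports per 7700R4C-38PE leaf
port_speed_gbps = 800
leaf_fwd_rate_pps = 5.4e9       # per-leaf forwarding rate (packets per second)

host_ports = leaves * host_ports_per_leaf
cluster_bw_pbps = host_ports * port_speed_gbps / 1e6    # Gbps -> Pbps
cluster_pps = leaves * leaf_fwd_rate_pps

print(f"{host_ports:,} x 800GbE host ports")             # 27,648
print(f"{cluster_bw_pbps:.1f} Pbps of host bandwidth")   # ~22.1 Pbps
print(f"{cluster_pps / 1e12:.1f} Tpps aggregate")        # ~8.3 Tpps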

Optimized for Scale Out Accelerated Compute

  • Support for Linear-drive Pluggable Optics (LPO)
  • Field serviceable supervisor module
  • 1+1 redundant & hot-swappable power
  • N+1 redundant & hot-swappable fans
  • Over 96% efficient power supplies
  • Tool-less rails for simple installation
  • Live software patching for zero downtime maintenance
  • Self-healing software with Stateful Fault Repair (SFR)
  • Self-configuring fabric with hardware health checking and smart routing

Architected to maximize AI/ML performance

  • 100% Efficient Traffic Spraying Fabric
  • Virtual Output Queued (VOQ) ingress buffering
  • Distributed fabric scheduling
  • Integrated overprovisioning of fabric capacity
  • AI Analyzer powered by AVA
  • Advanced collective load balancing and congestion management
  • AI workflow integration
  • Designed for future UEC deployments

Advanced Traffic Control, Provisioning and Monitoring


Arista 7700R4 Distributed Etherlink Switch Technical Specifications

The 7700R4 Distributed Etherlink Switch series builds on seven previous generations of Arista 7500 and 7800 series systems, leveraging a fundamental system architecture that has been proven by hundreds of thousands of systems in the world’s largest and most critical networks and is acknowledged as the most efficient solution for scaling back-end networks for accelerated computing. Where the 7800 Series offers modularity within a single system, the 7700R series extends scaling beyond a single physical system, enabling scalability to thousands of ports.

Separating the traditional line card and fabric functions into discrete devices enables substantially larger topologies than are possible in a single physical chassis, while also distributing power consumption, cooling needs and the control plane across multiple devices and enabling hosts to connect to distributed leaves using cost-effective, low-power passive copper cabling.

The fully distributed control plane runs locally on each system and combines with hardware-based topology configuration, integrated end to end scheduling, distributed queueing and automatic link health monitoring to minimize deployment time and maximize performance and reliability in service.

7700R4 Benefits

The Arista 7700R Series is optimized to deliver massively parallel end to end scheduling and coordination - even in the largest topologies, the platform behaves like a single, extremely large switching chip that is 100% internally lossless and fair. Four advanced technologies combine to deliver this architecture:

  • Traffic spraying fabric - uniform spraying of traffic across all fabric links delivers 100% efficiency and neutralizes the impact of elephant flows on mice flows
  • Virtual Output Queues (VOQ) - maintains ingress virtual queues for every egress port, eliminating head of line blocking (HOLB)
  • Distributed Credit Scheduling - schedules every egress port independently, eliminating HOLB and noisy neighbors (see the sketch after this list)
  • Deep Buffering - absorbs incast, bursts and speed mismatches without packet loss, keeping TCP running efficiently
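A deliberately simplified model of the VOQ and credit-scheduling interaction is sketched below (Python; the class and method names are hypothetical and the behavior is reduced to its essentials, not the actual hardware implementation): each ingress holds a separate queue per egress port, and packets only cross the fabric once the egress scheduler grants a credit, so a congested egress never blocks traffic destined elsewhere.

from collections import defaultdict, deque

class DistributedSwitchModel:
    """Toy model of VOQ plus egress credit scheduling (illustrative only)."""

    def __init__(self):
        # One virtual output queue per (ingress, egress) pair: no HOLB,
        # because packets for a congested egress wait in their own queue.
        self.voqs = defaultdict(deque)

    def enqueue(self, ingress, egress, packet):
        self.voqs[(ingress, egress)].append(packet)

    def grant_credits(self, egress, credits):
        """Egress scheduler grants credits round-robin to ingresses with
        traffic queued for it; only granted packets cross the fabric."""
        delivered = []
        waiting = [q for (ing, egr), q in self.voqs.items() if egr == egress and q]
        while credits and waiting:
            for q in list(waiting):
                if not credits:
                    break
                delivered.append(q.popleft())
                credits -= 1
                if not q:
                    waiting.remove(q)
        return delivered

sw = DistributedSwitchModel()
sw.enqueue("leaf1", "leaf9/port1", "pktA")   # destined to a congested port, stays queued
sw.enqueue("leaf1", "leaf9/port2", "pktB")   # independent egress, unaffected
print(sw.grant_credits("leaf9/port2", credits=1))  # ['pktB'] delivered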

Deployment of the 7700R4 Distributed Etherlink Switch follows the well understood and widely implemented leaf-spine physical topology model. In a simple 2-tier configuration, DES scales to support over 4600 x 800GbE hosts (4x the size of the largest 7816R4 modular system) in a single logical system with single-hop logical forwarding. By introducing an orthogonal super-spine layer, DES can expand further to support tens of thousands of 400GbE or 800GbE connected accelerators.

DES’ innovative logical single-hop architecture, robust traffic management and inherently fair forwarding paradigm eliminates the complexity of workload-specific architectures and tuning, abstracting logical and physical topologies while enabling multiple unique workloads in parallel. In many cases, DES removes the need for unique, job specific, cabling plans, as every accelerator is logically connected to the same single distributed switch. This allows the use of low cost, short reach cables and optics, saving significant expenditure and power consumption on cross-facility optical interconnections and eliminating the need to re-cable for different workload topologies.

This advanced architecture enables the Arista 7700R4 to handle the most demanding cluster computing workloads with ease. Generative AI clusters, ML and HPC environments all benefit from the 7700R4’s ability to linearly scale bisection bandwidth and logical radix while handling high bandwidth, low entropy traffic and mixed traffic loads without increasing latency or introducing unnecessary congestion.

7700R4 Specifications

                           7700R4C-38PE          7720R4-128PE
Role                       Distributed Leaf      Distributed Spine
Host Ports                 18 x 800G OSFP        -
Fabric Ports               20 x 800G OSFP        128 x 800G OSFP
Max 800GbE                 18                    -
Max 400GbE                 36                    -
Port Buffer                16 GB                 -
Switching Capacity (FDX)   14.4 (28.8) Tbps      102.4 (204.8) Tbps
Forwarding Rate            5.4 Bpps              -
Latency (End to End)       From under 4 usec
Rack Units                 2                     7
Airflow                    Front to Rear
AC and DC PSU              Yes                   Yes

7700R4 Scalability

The 7700R4 Distributed Etherlink Switch can be deployed in either a 2-tier (leaf and spine) or a 3-tier (leaf, spine and super spine) topology. Two-tier deployments scale to over 4600 x 800G hosts, while 3-tier deployments scale to over 27,000 x 800G or over 31,000 x 400G hosts. Examples are shown in the table below:

Scaling Examples          1152 x 800G Hosts    2304 x 800G Hosts    4608 x 800G Hosts    Over 4608 x 800G Hosts
Number of Spine Nodes     10                   20                   40                   Up to 800 (inc. Super Spine)
Number of Leaf Nodes      64                   128                  256                  Over 1500
Leaf-Spine Interconnect   16 Tbps              16 Tbps              16 Tbps              16 Tbps
                          (1600G per spine)    (800G per spine)     (400G per spine)     (400G per spine)
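The 2-tier columns in the table above follow from the per-leaf port counts, as the short check below illustrates (Python; it assumes each leaf dedicates all 18 host ports to accelerators and spreads its full 20 x 800G of fabric bandwidth evenly across the spine nodes, as in the examples):

# Recompute the 2-tier scaling examples from per-leaf port counts.
HOST_PORTS_PER_LEAF = 18        # 800GbE host ports per leaf
LEAF_FABRIC_BW_G = 20 * 800     # 16 Tbps of fabric bandwidth per leaf

for leaves, spines in [(64, 10), (128, 20), (256, 40)]:
    hosts = leaves * HOST_PORTS_PER_LEAF
    per_spine_g = LEAF_FABRIC_BW_G // spines
    print(f"{leaves} leaves / {spines} spines: "
          f"{hosts} x 800G hosts, {per_spine_g}G per spine per leaf")

# 64 leaves / 10 spines: 1152 x 800G hosts, 1600G per spine per leaf
# 128 leaves / 20 spines: 2304 x 800G hosts, 800G per spine per leaf
# 256 leaves / 40 spines: 4608 x 800G hosts, 400G per spine per leaf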