AI Data Center Test Platform

Accelerate design and deployment of AI network infrastructure

Highlights

The Keysight AI Data Center Test Platform offers the ability to:

  • Emulate high-scale AI workloads with measurable fidelity. Gain deep insights into collective communication performance.
  • Simplify the benchmarking process. Validate the AI network fabric with pre-packaged benchmark applications, built through partnerships with key AI operators and AI infrastructure vendors.
  • Execute defined AI / ML behavioral models. Share them between users and customers to help reproduce experiments.
  • Choose your test engine. Run AI workload emulation on Keysight hardware load appliances and software endpoints, or run on real AI accelerators, and compare the benchmarking results.

Enable Emerging Inflection in AI / ML

Key trends and challenges in the AI / ML industry include:

  • AI clusters will grow to 40K+ nodes in 2024
  • AI accelerators sit idle up to 50% of the time waiting for data exchange
  • Innovation in AI networking requires new measurement and benchmarking tools

The Keysight AI Data Center Test Platform is an industry-leading 800 / 400GE test solution with a track record of lossless fabric validation. Compared to benchmarking with GPU-based systems, it is faster to deploy and provides deeper insights, and it delivers provable fidelity of AI traffic emulation.

Accelerate AI Network Design

Define the future of AI / ML infrastructure. Unlock possibilities and shape tomorrow’s landscape.

Benchmark job completion time of AI collective communications

Navigate the complexities of AI workloads.

Achieve precision in network performance measurements

Make design decisions based on deeper AI communications insights.

Flexible what-if scenarios

Optimize AI collective performance by experimenting with AI traffic patterns to fine-tune fabric configuration.

Cost-effective high-density AI network testbeds

Scale experiments with AresONE-M 800GE and AresONE-S 400GE AI traffic emulation.

Transform AI Infrastructure Benchmarking

The Keysight AI Data Center Test Platform helps transform AI infrastructure benchmarking with precision and speed, by:
  • Optimizing AI / ML system design with realistic emulation of high-scale AI workloads.
  • Delivering insights into collective communications performance.
  • Simplifying benchmarking and validation with pre-packaged methodologies delivered as applications.
  • Emulating Remote Direct Memory Access (RDMA) over Converged Ethernet v2 (RoCEv2) endpoints by using high-density AresONE traffic load appliances with hundreds of 400GE or 800GE ports.

Keysight Collective Communications Benchmark

Pre-packaged methodology co-developed with key AI operators

The Keysight Collective Communications Benchmark application is designed to run micro-benchmarks of typical AI collective communication algorithms on a user-provided AI network fabric.
  • Evaluate AI network fabric performance for common types of collective communications.
  • Measure performance metrics, including job completion time, algorithm bandwidth, and bus bandwidth; calculate the percentage of ideal to quantify deviation from theoretical maximum performance.
  • Use AresONE hardware to measure and analyze the performance of Queue Pairs (AI data flows), and summarize results as percentiles with drill-down capabilities for further analysis.
  • Assess RoCEv2 emulation fidelity by comparing AresONE hardware results with metrics collected on actual AI systems.
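
The metrics listed above can be related to each other with a little arithmetic. The Python sketch below is illustrative only and is not taken from the Keysight application: it assumes the bus-bandwidth correction factors documented for the open-source nccl-tests suite, and it assumes that "percentage of ideal" means bus bandwidth divided by the port line rate. The function and parameter names are hypothetical.

```python
# Illustrative sketch: derive algorithm bandwidth, bus bandwidth, and a
# percentage-of-ideal figure from a measured job completion time (JCT).
# Assumption: correction factors follow the nccl-tests convention; the
# Keysight application may define these metrics differently.

def bus_factor(collective: str, n: int) -> float:
    """Factor that converts algorithm bandwidth to bus bandwidth for n endpoints."""
    if collective == "all_reduce":
        return 2 * (n - 1) / n
    if collective in ("all_gather", "reduce_scatter", "all_to_all"):
        return (n - 1) / n
    if collective in ("broadcast", "reduce"):
        return 1.0
    raise ValueError(f"unknown collective: {collective}")


def collective_metrics(collective: str, n_endpoints: int, data_bytes: int,
                       jct_seconds: float, link_gbps: float) -> dict:
    """Return bandwidth metrics in Gbps plus percentage of ideal (assumed definition)."""
    algbw_gbps = data_bytes * 8 / jct_seconds / 1e9   # data size / completion time
    busbw_gbps = algbw_gbps * bus_factor(collective, n_endpoints)
    ideal_pct = 100 * busbw_gbps / link_gbps          # assumed: ideal == line rate
    return {"algbw_gbps": algbw_gbps, "busbw_gbps": busbw_gbps, "ideal_pct": ideal_pct}


# Hypothetical measurement: 1 GB all-reduce across 8 endpoints in 50 ms on 400GE ports.
example = collective_metrics("all_reduce", 8, 1_000_000_000, 0.05, 400.0)
print(example)
```

With these made-up inputs the algorithm bandwidth is about 160 Gbps, the bus bandwidth about 280 Gbps, and the result lands at roughly 70% of the 400GE line rate.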

RoCEv2 Endpoints Emulation and Stateful Validation

Beyond emulation, pioneering precision in RoCEv2 validation

RoCEv2 Support in IxNetwork / AresONE-S

IxNetwork / AresONE-S supports the RoCEv2 transport protocol with Data Center Quantized Congestion Notification (DCQCN) congestion control and Priority Flow Control (PFC). It provides a scalable and cost-effective way to validate how effectively data plane traffic is managed in AI clusters, which helps optimize network fabric performance.

Speed and Scale

AresONE-S offers up to 16 x 400GE ports per device and can be combined into a multi-appliance configuration with 256+ ports in a single collective. Each port emulates a RoCEv2 endpoint and supports thousands of Queue Pairs with line-rate traffic. This scale is crucial for reproducing the network topologies of real AI clusters.

Traffic Flexibility

To match the realism of AI workload patterns and reproduce issues on smaller setups, AresONE RoCEv2 capabilities cover a range of traffic patterns in the first release, from in-cast, to partial mesh, to full all-to-all collectives. At the transport level, it supports sequences of RDMA verbs with configurable data sizes, burst rates, and intervals, all combined with DCQCN and PFC rate control mechanisms.
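
To make the three pattern families concrete, the sketch below generates (source, destination) flow pairs for each one. It is a generic illustration, not the AresONE configuration model: the function names are hypothetical, and "partial mesh" is assumed here to mean each endpoint sending to a fixed number of ring neighbors.

```python
from itertools import permutations


def incast(endpoints: list[str], receiver: str) -> list[tuple[str, str]]:
    """Every other endpoint sends to a single receiver."""
    return [(src, receiver) for src in endpoints if src != receiver]


def partial_mesh(endpoints: list[str], fanout: int) -> list[tuple[str, str]]:
    """Each endpoint sends to its next `fanout` neighbors (assumed definition)."""
    n = len(endpoints)
    if not 0 < fanout < n:
        raise ValueError("fanout must be between 1 and len(endpoints) - 1")
    return [(endpoints[i], endpoints[(i + k) % n])
            for i in range(n) for k in range(1, fanout + 1)]


def all_to_all(endpoints: list[str]) -> list[tuple[str, str]]:
    """Every endpoint sends to every other endpoint."""
    return list(permutations(endpoints, 2))


ports = [f"port{i}" for i in range(8)]
print(len(incast(ports, "port0")), len(partial_mesh(ports, 2)), len(all_to_all(ports)))
```

With 8 endpoints this yields 7 in-cast flows, 16 partial-mesh flows at a fanout of 2, and 56 all-to-all flows, which shows how quickly the number of flows through the fabric grows with the pattern.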

Per Queue Pair DCQCN Flow Control

DCQCN per queue pair enables precise network congestion control with features like Explicit Congestion Notification (ECN) and rate control, optimizing data flow and network fabric responsiveness.
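
The per-queue-pair behavior can be sketched as a small state machine. The code below is a simplified illustration based on the published DCQCN algorithm, not on the AresONE implementation: it models only the rate cut on a Congestion Notification Packet (CNP), the alpha decay, and the fast-recovery step, and it omits the additive and hyper increase phases and all timers. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DcqcnQueuePair:
    """Simplified DCQCN sender state for one queue pair (illustrative only)."""
    line_rate_gbps: float
    g: float = 1 / 256          # alpha gain; 1/256 is the value used in the DCQCN paper
    alpha: float = 1.0          # congestion estimate, starts at 1
    current_rate: float = 0.0   # set to line rate in __post_init__
    target_rate: float = 0.0    # set to line rate in __post_init__

    def __post_init__(self) -> None:
        self.current_rate = self.line_rate_gbps
        self.target_rate = self.line_rate_gbps

    def on_cnp(self) -> None:
        """CNP received: remember the current rate, cut it, and raise alpha."""
        self.target_rate = self.current_rate
        self.current_rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_alpha_timer(self) -> None:
        """No CNP during the alpha update interval: decay alpha."""
        self.alpha = (1 - self.g) * self.alpha

    def on_increase_timer(self) -> None:
        """Fast recovery: move halfway back toward the target rate."""
        self.current_rate = (self.target_rate + self.current_rate) / 2


qp = DcqcnQueuePair(line_rate_gbps=400.0)
qp.on_cnp()             # congestion signaled via ECN marking and a returned CNP
qp.on_increase_timer()  # first recovery step
print(qp.current_rate, qp.target_rate, qp.alpha)
```

Because each queue pair keeps its own rate and alpha, a congested flow can be slowed without throttling other flows that share the same port.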

AI Test Hardware

Keysight's data center load modules deliver high-density, high-performance Ethernet / IP test solutions, including the industry's first at 1G, 10G, 25G, 40G, 50G, 100G, 400G, and 800G speeds.
