Optimizing All-to-All Collective Communications for AI-ML Workloads

Application Notes

GPUs are costly, and every idle GPU is money down the drain. In AI-ML computation, the fabric connecting the servers must be efficient (high throughput, low latency, and lossless) so that the network does not become the cause of GPU idleness. Our solution provides various measurements for assessing the health of the network before deployment, and there are multiple parameters to tune. This application note helps customers understand the features of the solution and guides them on what to measure, so that they can build a proper test strategy using it.

This document focuses on improving communication performance for machine learning and high-performance computing tasks. Our RoCEv2 emulation in IxNetwork exposes various configuration parameters that modify network behavior, and this application note discusses them. It also describes the measurements available through statistics, along with some parameters on the customer's device under test (DUT) that enable congestion control.
