KAI Inference Builder Bundle with 10 Agents and up to 1000 Prompts per Second

The KAI Inference Builder Bundle includes 10 agents and up to 1000 prompts per second (1-year subscription, floating worldwide). The bundle is TAA Compliant.

  • Form factor

    Software

  • License types

    Subscription

  • Performance Level

    1000 prompts per second, 10000 simulated users

Ready for a quote?

Find out what's included and explore available upgrade options from Keysight.

Highlights

  • Emulate realistic AI client behavior at scale to validate entire AI inference infrastructures and stacks.
  • Choose from different AI persona prompts that apply pressure at different stages of the AI inference pipeline.
  • Validate AI inference infrastructures deployed in public or private clouds with fully virtual or hardware-based inference client emulation.
  • Scale to millions of emulated users with granular control over the generated prompts-per-second load for unmatched AI inference scale testing.
  • Get detailed inference statistics to gain actionable insights into potential bottlenecks, limits, and inefficiencies at various components of the AI inference pipeline:
    • GPU compute
    • HBM / VRAM memory systems
    • KV-cache and storage layers
    • PCIe and RDMA interconnects
    • Model engines and orchestrators
  • Correlate client-side metrics with ingested inference-engine telemetry (for example, vLLM statistics) and system-level GPU telemetry (for example, DCGM data) in a single time-synchronized view:
    • Prompts per second
    • Concurrent Users
    • Time to First Token (TTFT) — Max and percentiles (for example, P50, P90, P99)
    • Time to Last Token (TTLT) — Max and percentiles (for example, P50, P90, P99)
    • Tokens per second (input / output)
    • Cache Usage
    • Prefill and Decode Time
    • Tensor Core Usage
    • Scheduler State
    • GPU Power Usage
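
As an illustration of how the client-side latency metrics listed above relate to raw request timestamps, the following is a minimal Python sketch. It is not Keysight's implementation or API: the names `RequestTiming`, `percentile`, and `summarize` are hypothetical, and the nearest-rank percentile definition is an assumption (other tools may interpolate between samples).

```python
import math
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical per-prompt record; all times are in seconds."""
    sent_at: float         # when the prompt was sent
    first_token_at: float  # when the first output token arrived
    last_token_at: float   # when the last output token arrived
    output_tokens: int     # number of output tokens received


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list, with 0 < p <= 100."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_stats(samples):
    """Max and P50/P90/P99 for one latency series."""
    return {
        "p50": percentile(samples, 50),
        "p90": percentile(samples, 90),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


def summarize(timings):
    """Compute TTFT, TTLT, and aggregate output tokens per second."""
    ttft = [t.first_token_at - t.sent_at for t in timings]
    ttlt = [t.last_token_at - t.sent_at for t in timings]
    # Measurement window: first prompt sent to last token received.
    window = max(t.last_token_at for t in timings) - min(t.sent_at for t in timings)
    total_tokens = sum(t.output_tokens for t in timings)
    return {
        "ttft": latency_stats(ttft),
        "ttlt": latency_stats(ttlt),
        "output_tokens_per_second": total_tokens / window if window > 0 else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    demo = [RequestTiming(0.0, float(i), float(i + 10), 20) for i in range(1, 11)]
    print(summarize(demo))
```

Here TTFT is measured from the moment the prompt is sent to the first output token, and TTLT from the same start to the last output token, so TTLT is always at least as large as TTFT for a given request.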