
HPC-X ClusterKit IB Network Testing Guide

Abstract: This guide describes how to use ClusterKit (part of the NVIDIA HPC-X toolkit) to validate InfiniBand (IB) network performance. ClusterKit automates pairwise bandwidth and latency testing across cluster nodes, which makes it well suited to acceptance testing.

1. Environment & Prerequisites

1.1 Reference Environment

Verified on the following baseline:

  • CPU: Intel Xeon Platinum 8358 (Ice Lake)
  • OS: CentOS 7.9 / RHEL 8 / Ubuntu 22.04
  • Driver: MLNX_OFED_LINUX (Version matching OS)

1.2 Prerequisites

Before starting:

  1. DNS/Hosts: Configure /etc/hosts (or DNS) on all nodes so that every node's hostname resolves.
  2. SSH Trust: Configure passwordless SSH from the control node to all compute nodes.
  3. IB Status: Ensure all IB ports are in the Active state (check with ibstat).
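
The SSH and IB checks above can be scripted. The following is a minimal sketch, not part of ClusterKit: the function names count_inactive_ports and check_nodes are hypothetical, and it assumes ibstat is installed on every node and prints one "State:" line per port.

```shell
# Count ports whose logical state is not "Active" in ibstat output read from stdin.
# Only "State:" lines are matched, not "Physical state:" lines.
count_inactive_ports() {
    awk '$1 == "State:" && $2 != "Active" { n++ } END { print n + 0 }'
}

# Check every host in a hostfile ($1, one hostname per line).
# BatchMode=yes makes ssh fail instead of prompting, which also
# verifies that passwordless SSH is in place.
check_nodes() {
    for h in $(cat "$1"); do
        if out=$(ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" ibstat); then
            n=$(printf '%s\n' "$out" | count_inactive_ports)
            echo "$h: $n inactive port(s)"
        else
            echo "$h: ssh or ibstat failed"
        fi
    done
}

# Usage from the control node:
#   check_nodes hostfile
```

Any host that reports a failure or a non-zero inactive count should be fixed before the test run.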

2. Software Acquisition & Deployment

2.1 Download HPC-X

Download the HPC-X toolkit build that matches your OS, MLNX_OFED version, and CUDA version from the official NVIDIA HPC-X download page.

2.2 Installation

The steps below assume the archive was downloaded to /opt/software/ and is extracted there.

bash
# 1. Extract
cd /opt/software/
tar xvf hpcx-v2.13.1-gcc-MLNX_OFED_LINUX-5-redhat7-cuda11-gdrcopy2-nccl2.12-x86_64.tbz

# 2. Load Environment
# Enter directory
cd hpcx-v2.13.1-gcc-MLNX_OFED_LINUX-5-redhat7-cuda11-gdrcopy2-nccl2.12-x86_64

# Set Home
export HPCX_HOME=$PWD

# Initialize
source $HPCX_HOME/hpcx-init.sh
hpcx_load

Verification

Run which clusterkit.sh after hpcx_load to confirm the path is set correctly.
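
The same check can be wrapped in a small helper for use at the top of a test script. This is a sketch; the function name require_cmd is hypothetical and not part of HPC-X.

```shell
# Return 0 and print the resolved path if the command is on PATH;
# otherwise print a hint to stderr and return 1.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $(command -v "$1")"
    else
        echo "missing: $1 (was hpcx_load run in this shell?)" >&2
        return 1
    fi
}

# Usage after hpcx_load:
#   require_cmd clusterkit.sh || exit 1
```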

3. Test Execution

3.1 Identify Device Name

If servers have multiple IB cards, identify the target HCA device name.

bash
ibdev2netdev

Example Output:

text
mlx5_0 port 1 ==> ib0 (Up)
mlx5_1 port 1 ==> ib1 (Up)

Note the device ID, e.g., mlx5_0:1 (Device:Port).
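
When a node has many ports, the Device:Port list can be derived from the ibdev2netdev output instead of being typed by hand. The sketch below assumes the output format shown in the example above; the function name is hypothetical. On mixed systems ibdev2netdev also lists Ethernet ports, so review the result before using it.

```shell
# Read ibdev2netdev output on stdin and print a comma-separated
# Device:Port list for every interface reported as (Up).
hca_list_from_ibdev2netdev() {
    awk '$2 == "port" && $NF == "(Up)" {
             printf "%s%s:%s", sep, $1, $3
             sep = ","
         }
         END { print "" }'
}

# Usage on a compute node:
#   ibdev2netdev | hca_list_from_ibdev2netdev
```

For the example output above, this prints mlx5_0:1,mlx5_1:1.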

3.2 Prepare Hostfile

Create a hostfile listing the hostnames of nodes participating in the test:

text
node01
node02
node03
node04
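
For larger clusters with sequential names, the hostfile can be generated rather than written by hand. This sketch assumes the node01-style naming used in the example (a prefix plus a zero-padded two-digit number); make_hostfile is a hypothetical helper.

```shell
# Print sequential hostnames, one per line.
# Usage: make_hostfile PREFIX FIRST LAST   (FIRST and LAST are plain integers)
make_hostfile() {
    i=$2
    while [ "$i" -le "$3" ]; do
        printf '%s%02d\n' "$1" "$i"
        i=$((i + 1))
    done
}

# Usage:
#   make_hostfile node 1 4 > hostfile
```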

3.3 Run Command

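Navigate to the bin directory and run the test. The --hostfile path is resolved relative to the current directory, so copy the hostfile into this directory or pass an absolute path: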

bash
cd $HPCX_HOME/clusterkit/bin

# Scenario: Test ib0 (mlx5_0:1), Unidirectional Bandwidth
./clusterkit.sh --hca_list "mlx5_0:1" --hostfile hostfile --unidirectional

3.4 Key Parameters

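| Parameter | Description | Example |
|---|---|---|
| --hca_list | HCA devices to test, as a comma-separated Device:Port list | "mlx5_0:1" or "mlx5_0:1,mlx5_1:1" |
| --unidirectional | Run the unidirectional bandwidth test (default is bidirectional) | - |
| --hostfile | Path to the host list file | --hostfile ./hosts |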
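
To test each rail separately, the parameters above can be combined into one command per HCA. The sketch below only prints the commands (a dry run) so they can be reviewed first; print_per_hca_cmds is a hypothetical helper, and it uses only the flags listed in the table.

```shell
# Print (do not run) one clusterkit.sh command per HCA.
# Usage: print_per_hca_cmds HOSTFILE HCA...
print_per_hca_cmds() {
    hostfile=$1
    shift
    for hca in "$@"; do
        printf './clusterkit.sh --hca_list "%s" --hostfile %s --unidirectional\n' \
            "$hca" "$hostfile"
    done
}

print_per_hca_cmds ./hosts mlx5_0:1 mlx5_1:1
```

Running each HCA on its own makes it easier to attribute a slow result to a specific card or cable.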

4. Result Analysis

ClusterKit generates a timestamped directory (e.g., 20240112_161542) containing results.

4.1 Core Files

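  • bandwidth.txt: Pairwise bandwidth matrix.
    • Compare values against the theoretical line rate (200 Gb/s for HDR, 400 Gb/s for NDR). Healthy links measure slightly below line rate because of protocol overhead; a pair far below its peers points to a problem link or node.
  • latency.txt: Pairwise latency matrix.
    • Typical IB latency is 1 to 3 µs, depending on the number of switch hops.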
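
Line rates are quoted in Gb/s, while benchmark output is often in MB/s, so a unit conversion helps when reading the matrix: 1 Gb/s = 125 MB/s in decimal units, so 200 Gb/s corresponds to 25000 MB/s. The sketch below assumes the measured value is in MB/s; check the units in your own bandwidth.txt before applying it. Both function names are hypothetical.

```shell
# Convert a line rate in Gb/s to MB/s (decimal units: 1 Gb/s = 125 MB/s).
gbps_to_mbytes() {
    echo $(( $1 * 125 ))
}

# Print a measured bandwidth as a percentage of line rate.
# $1: measured value in MB/s (assumed unit), $2: line rate in Gb/s
pct_of_line_rate() {
    awk -v m="$1" -v g="$2" 'BEGIN { printf "%.1f\n", 100 * m / (g * 125) }'
}

gbps_to_mbytes 200            # prints 25000
pct_of_line_rate 24000 200    # prints 96.0
```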

5. Troubleshooting

5.1 Even Number of Nodes Required

Error: must be run with an even number of nodes

ClusterKit performs pairwise testing by default, so every node needs a partner. If the hostfile contains an odd number of nodes (e.g., 3), the run fails with this error.

Solution: Remove one node or add one node to make the count even.
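
The node count can be checked and corrected before the run. This sketch drops the last host when the count is odd; even_hosts is a hypothetical helper, and which node to drop is a choice you may want to make differently (for example, rotating the excluded node across runs so every node gets tested).

```shell
# Print the hosts from stdin, skipping blank lines and,
# if the count is odd, dropping the last host.
even_hosts() {
    awk 'NF { h[++n] = $1 }
         END {
             if (n % 2) n--
             for (i = 1; i <= n; i++) print h[i]
         }'
}

# Usage:
#   even_hosts < hostfile > hostfile.even
```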

5.2 Low Performance

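  • Verify that the HCA sits in a x16 PCIe Gen4/Gen5 slot and that the link negotiated full width and speed.
  • Check BIOS Performance Mode settings.
  • Use iblinkinfo to confirm that each link runs at the expected width and speed, and perfquery or ibqueryerrors to check for symbol errors on the physical link.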
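
The PCIe check can be done with lspci, whose verbose output includes an LnkSta line with the negotiated speed and width (16GT/s is Gen4, 32GT/s is Gen5). The sketch below extracts those two values; pcie_link_status is a hypothetical helper, and the PCI address in the usage line is a placeholder.

```shell
# Extract the negotiated PCIe speed and width from `lspci -vv` output on stdin.
# Prints e.g. "16GT/s x16" for each LnkSta line found.
pcie_link_status() {
    sed -n 's/.*LnkSta:[[:space:]]*Speed \([0-9.]*GT\/s\).*Width \(x[0-9]*\).*/\1 \2/p'
}

# Usage (0000:17:00.0 is a placeholder; find the HCA address with
# `lspci | grep -i mellanox`):
#   sudo lspci -vv -s 0000:17:00.0 | pcie_link_status
```

A result such as 8GT/s x16 or 16GT/s x8 on a Gen4 x16 card indicates a downgraded link, which caps bandwidth regardless of the IB fabric.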

AI-HPC Organization