InfiniBand¶
InfiniBand is a kind of interconnection widely used in supercomputers. It features high throughput and low latency. We use InfiniBand cards manufactured by Mellanox.
Driver and system setup¶
Download latest MLNX_OFED driver (driver download link), decompress it and cd into the directory.
Note
There are open-source drivers available. We use the driver provided by Mellanox here.
Before installation proceeds, install some required libraries first.
yum install tcl tk
Run install script. This might take some time.
./mlnxofedinstall
If required, unload old driver and load new driver.
modprobe -rv ib_isert rpcrdma ib_srpt
/etc/init.d/openibd restart
Start opensm (InfiniBand subnet manager).
systemctl enable opensmd --now
Warning
Start opensmd on one of the cluster node only. If multiple nodes start opensmd at the same time, it may cause problem.
Check InfiniBand status.
ibstat
# State: Active
# Physical state: LinkUp
ibstatus
IPoIB¶
We can use InfiniBand as usual network interface.
ifconfig¶
Create a ifconfig script in /etc/sysconfig/network/scripts.
TYPE=InfiniBand
BOOTPROTO=static
IPV6_AUTOCONF=no
NAME=ib0
DEVICE=ib0
ONBOOT=yes
IPADDR=10.18.18.1
PREFIX=24
CONNECTED_MODE=yes
MTU=65520
Here we enable connected mode to maximize performance.
Restart network.
systemctl restart network
systemd-networkd¶
Create a config file in /etc/systemd/network.
[Match]
Name=ib0
[Network]
Address=10.18.18.1/24
Restart systemd-networkd.
systemctl restart systemd-networkd
To enable connected mode, we can use this AUR package. Remove the rdma.service dependency in ipoibmodemtu.service, since we are using Mellanox’s driver.
Copy the files to their specified path, then start systemd service.
cp ipoibmodemtu /usr/bin/ipoibmodemtu
cp ipoibmodemtu.conf /etc/ipoibmodemtu.conf
cp ipoibmodemtu.service /usr/lib/systemd/system/ipoibmodemtu.service
systemctl daemon-reload
systemctl enable ipoibmodemtu.service --now
Mellanox OFED GPUDirect RDMA¶
Not included in Mellanox driver. Get the source from github.
git clone https://github.com/Mellanox/nv_peer_memory.git
Build kernel module.
./build_module.sh
Install using rpm.
rpmbuild --rebuild /tmp/nvidia_peer_memory-*.src.rpm
rpm -ivh $HOME/rpmbuild/RPMS/x86_64/nvidia_peer_memory-*.x86_64.rpm
Start system service.
systemctl enable nv_peer_mem --now