NVIDIA Data Processing Units

From Beam Line Controls

Introduction

NVIDIA DPUs are expansion cards that offload certain network-traffic-related tasks from the host CPU. Each card comprises an ARM CPU, memory, and a high-speed ConnectX NIC on a single board. Network packets to/from the host can be manipulated by the ARM CPU using programs written with the DPDK or DOCA SDKs. The DPU ARM system runs its own OS; as shipped by NVIDIA, this is currently Ubuntu Linux.

DPU host software setup

Bluefield DPU Administrator Quick Start Guide (NVIDIA)

On a RHEL8 machine, first install the repository RPM for the DOCA and DPU-related packages. This provides both a local copy of the necessary RPMs and enables a YUM repo for updates.

$ wget https://www.mellanox.com/downloads/DOCA/DOCA_v1.5.1/doca-host-repo-rhel86-1.5.1-0.1.8.1.5.1007.1.el8.5.8.1.1.2.1.x86_64.rpm
# yum install ./doca-host-repo-rhel86-1.5.1-0.1.8.1.5.1007.1.el8.5.8.1.1.2.1.x86_64.rpm

We find that the NVIDIA repos tend to time out when accessed from the APS, so add the following to the end of /etc/yum.conf:

minrate=10 
timeout=300

Then install the necessary RPMs, allowing for downgrades and package removals:

# yum makecache 
# yum install --allowerasing --nobest doca-runtime doca-tools pv

rshim is a userspace tool which allows for configuration of NVIDIA Mellanox cards. Ensure rshim is running and enabled at boot with systemctl status rshim (look for "active (running)" and "enabled").
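A quick scripted check of the service state; this is a sketch that assumes a systemd host and the rshim unit name installed by the DOCA packages (it prints "inactive" if systemd or the unit is unavailable):

```shell
# Report the rshim service state without failing the script
state="$(systemctl is-active rshim 2>/dev/null)" || state=inactive
echo "rshim: $state"
# If inactive, start and enable it:
#   sudo systemctl enable --now rshim
```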

mst, or Mellanox Software Tools, is a userspace program which creates a device tree used for configuration.

# mst start 
Starting MST (Mellanox Software Tools) driver set 
Loading MST PCI module - Success 
Loading MST PCI configuration module - Success 
Create devices 
Unloading MST PCI module (unused) - Success 
# mst status -v 
MST modules: 
------------ 
    MST PCI module is not loaded 
    MST PCI configuration module loaded 
PCI devices: 
------------ 
DEVICE_TYPE             MST                           PCI       RDMA            NET                       NUMA   
BlueField2(rev:1)       /dev/mst/mt41686_pciconf0     ca:00.0   mlx5_0          net-ib0                   1
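When scripting against the card, the MST device path can be pulled out of the mst status output rather than hard-coded; a sketch, assuming the BlueField entry is the only pciconf device (the sample line is copied from the transcript above):

```shell
# Extract the MST device path from `mst status -v` output.
# The sample line is taken from the transcript above; on a live system use:
#   MST_DEV=$(mst status -v | awk '/pciconf/ {print $2; exit}')
sample='BlueField2(rev:1)       /dev/mst/mt41686_pciconf0     ca:00.0   mlx5_0          net-ib0                   1'
MST_DEV=$(printf '%s\n' "$sample" | awk '/pciconf/ {print $2; exit}')
echo "$MST_DEV"   # → /dev/mst/mt41686_pciconf0
```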

Get a new Bluefield OS system image:

$ wget https://content.mellanox.com/BlueField/BFBs/Ubuntu20.04/DOCA_1.5.1_BSP_3.9.3_Ubuntu_20.04-4.2211-LTS.signed.bfb

and install it:

# bfb-install --bfb DOCA_1.5.1_BSP_3.9.3_Ubuntu_20.04-4.2211-LTS.signed.bfb --rshim rshim0
Collecting BlueField booting status. Press Ctrl+C to stop… 
INFO[BL2]: start 
INFO[BL2]: DDR POST passed 
INFO[BL2]: UEFI loaded 
INFO[BL31]: start 
INFO[BL31]: runtime 
INFO[UEFI]: UPVS valid 
INFO[UEFI]: eMMC init 
INFO[UEFI]: eMMC probed 
INFO[UEFI]: PMI: updates started 
INFO[UEFI]: PMI: boot image update 
INFO[UEFI]: PMI: updates completed, status 0 
INFO[UEFI]: PCIe enum start 
INFO[UEFI]: PCIe enum end 
INFO[MISC]: Ubuntu installation started 
INFO[MISC]: Installing OS image 
INFO[MISC]: Installation finished

Only if remote access from the DPU to the LAN/internet is required: enable IP forwarding on the host by adding the following to /etc/sysctl.d/50-dpu.conf

net.ipv4.conf.all.forwarding = 1 
net.ipv6.conf.all.forwarding = 1
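The drop-in file only takes effect at the next boot; to apply the forwarding settings immediately, reload all sysctl configuration (run as root):

```shell
# Load all sysctl drop-in files, including 50-dpu.conf, without rebooting
sysctl --system
# Confirm forwarding is on (should print "... = 1")
sysctl net.ipv4.conf.all.forwarding
```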

and set up IPv4 masquerading via nftables (replace $(host IP) with the host's address on the uplink interface, here ens6f0):

# nft add table nat 
# nft -- add chain nat prerouting { type nat hook prerouting priority -100 \; } 
# nft -- add chain nat postrouting { type nat hook postrouting priority 100 \; } 
# nft add rule nat postrouting oifname "ens6f0" snat to $(host IP) 
# nft list ruleset > /etc/nftables/dpu_nat.nft 
# echo 'include "/etc/nftables/dpu_nat.nft"' >> /etc/sysconfig/nftables.conf 
# systemctl enable nftables.service
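The $(host IP) placeholder in the snat rule above is the host's IPv4 address on the uplink interface. One way to extract it with awk; a sketch where the sample line mimics `ip -4 -o addr show` output, and the address 10.0.0.5 is made up for illustration:

```shell
# Parse the IPv4 address out of `ip -4 -o addr show` output.
# The sample line (and 10.0.0.5) are illustrative only; on a live host use:
#   HOST_IP=$(ip -4 -o addr show dev ens6f0 | awk '{split($4, a, "/"); print a[1]; exit}')
sample='2: ens6f0    inet 10.0.0.5/24 brd 10.0.0.255 scope global ens6f0'
HOST_IP=$(printf '%s\n' "$sample" | awk '{split($4, a, "/"); print a[1]; exit}')
echo "$HOST_IP"   # → 10.0.0.5
```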

DPU login and configuration

First set up an appropriate IP configuration for the tmfifo_net0 interface on the host. The DPU is factory-configured at 192.168.100.2.

# nmcli conn add type tun mode tap con-name tmfifo_net0 ifname tmfifo_net0 autoconnect yes ip4 192.168.100.1/24 ipv4.never-default true 
# nmcli conn up tmfifo_net0

then login via ssh:

$ ssh [email protected]

You should receive the Ubuntu login prompt and will be prompted to change the default user password at first login.

You can now update the DPU firmware:

$ sudo /opt/mellanox/mlnx-fw-updater/mlnx_fw_updater.pl

and configure it:

$ sudo mst start
$ sudo mlxconfig -d /dev/mst/mt41686_pciconf0 -y reset   # Reset all settings
$ sudo mlxconfig -d /dev/mst/mt41686_pciconf0 s LINK_TYPE_P1=2   # Set port 1 to Ethernet mode (not InfiniBand)
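The configured value can be read back to confirm the change took; a sketch assuming the same device path as above (prints the LINK_TYPE lines, or a fallback message if the tools or device are unavailable):

```shell
# Query current link-type configuration; falls back to a message when
# mlxconfig or the MST device is not present
sudo mlxconfig -d /dev/mst/mt41686_pciconf0 q 2>/dev/null | grep LINK_TYPE \
  || echo "device not available"
```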

Reboot the host and ensure settings persist.
