docaofed-patch

Builds a small compatibility RPM for DOCA OFED that restores libibverbs provider registration for EFA and mlx4 on Enterprise Linux 8, 9, and 10.

The DOCA OFED libibverbs package already ships the userspace provider libraries:

  • /usr/lib64/libefa.so.1
  • /usr/lib64/libmlx4.so.1

The missing pieces are the provider registration files and provider ABI symlinks:

  • /etc/libibverbs.d/efa.driver
  • /etc/libibverbs.d/mlx4.driver
  • /usr/lib64/libibverbs/libefa-rdmavNN.so
  • /usr/lib64/libibverbs/libmlx4-rdmavNN.so
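To see which of these pieces are absent on a given host, a check along the following lines can help. It is a minimal sketch and not part of this project: the function name `missing_provider_files`, its optional root-prefix argument, and the ABI number passed to it are illustrative assumptions. Substitute the NN that matches the installed libibverbs.

```shell
#!/bin/sh
# Hypothetical helper (not shipped by this project): print the EFA/mlx4
# registration files and ABI symlinks that are absent.
#   $1 = provider ABI number (the NN in rdmavNN), for example 59
#   $2 = optional root prefix, useful for inspecting an unpacked tree
missing_provider_files() {
    abi=$1
    root=${2:-}
    for provider in efa mlx4; do
        for path in \
            "/etc/libibverbs.d/${provider}.driver" \
            "/usr/lib64/libibverbs/lib${provider}-rdmav${abi}.so"
        do
            # -e follows symlinks, so a dangling symlink also counts as missing.
            [ -e "${root}${path}" ] || printf '%s\n' "$path"
        done
    done
}
```

Calling `missing_provider_files 59` on a host prints one path per missing file and prints nothing when all four are present.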

This project intentionally does not repackage NVIDIA RPMs. It generates a separate overlay package that can be installed after the DOCA OFED userspace packages.

Requirements

  • Enterprise Linux 8, 9, or 10 on an architecture shipped by the DOCA OFED repository
  • DOCA OFED rdma-core and libibverbs already installed
  • rpmbuild
  • root privileges, needed only to install the generated RPM

Usage

Build the compatibility RPM:

./build-docaofed-provider-compat.sh

Build and install it:

INSTALL_AFTER_BUILD=1 ./build-docaofed-provider-compat.sh

The built RPM is written to $HOME/PATCHED-DOCA-OFED by default.

The generated package requires the exact EVRs (epoch:version-release) of the installed DOCA OFED libibverbs and rdma-core packages, so rebuild the compatibility RPM after every DOCA OFED update.
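To check which EVRs are currently installed, for example before and after a DOCA OFED update, the packages can be queried with `rpm`. The sketch below is illustrative only: `format_evr` and `show_doca_evrs` are hypothetical helper names, and the build script may obtain the same information differently.

```shell
#!/bin/sh
# Illustrative helper: join epoch, version, and release the way RPM writes
# an EVR. rpm prints "(none)" for a package that has no epoch.
format_evr() {
    epoch=$1 version=$2 release=$3
    case $epoch in
        ''|'(none)') printf '%s-%s\n' "$version" "$release" ;;
        *)           printf '%s:%s-%s\n' "$epoch" "$version" "$release" ;;
    esac
}

# Print "name EVR" for the two packages the compatibility RPM depends on.
# Returns non-zero if either package is not installed.
show_doca_evrs() {
    for pkg in libibverbs rdma-core; do
        fields=$(rpm -q --queryformat '%{EPOCH} %{VERSION} %{RELEASE}\n' "$pkg") || return 1
        # Intentional word splitting: the three fields contain no spaces.
        set -- $fields
        printf '%s %s\n' "$pkg" "$(format_evr "$1" "$2" "$3")"
    done
}
```

Running `show_doca_evrs` before and after an update and comparing the output shows whether a rebuild is due.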

Support Policy

Validated support is for the public DOCA repositories listed below. The builder is intentionally version-adaptive: it reads the installed DOCA OFED libibverbs provider ABI and requires the exact installed libibverbs and rdma-core EVRs. Other DOCA repositories may work if they have the same missing EFA/mlx4 registration layout, but they are not claimed as supported until their repository layout and VM install path are validated.

| DOCA repository | Baseline file | EL targets | Provider ABI |
| --- | --- | --- | --- |
| latest | tests/doca-latest-supported.tsv | EL8, EL9, EL10 | rdmav59 |
| latest-3.2-LTS | tests/doca-3.2-lts-supported.tsv | EL8, EL9, EL10 | rdmav57 |
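The provider ABI is the rdmavNN suffix carried by the provider plugin file names. How the builder reads it is not described here; one plausible approach, sketched below, is to inspect the plugins already present in the libibverbs plugin directory. This assumes DOCA OFED installs at least one file named like `lib<name>-rdmavNN.so` there, and `detect_provider_abi` is a hypothetical name, not a function from this project.

```shell
#!/bin/sh
# Hypothetical sketch: infer the provider ABI suffix (rdmavNN) from the file
# names of provider plugins already present in a plugin directory.
# Assumes at least one lib<name>-rdmavNN.so exists there; this is not
# necessarily how build-docaofed-provider-compat.sh does it.
#   $1 = plugin directory (default: /usr/lib64/libibverbs)
detect_provider_abi() {
    dir=${1:-/usr/lib64/libibverbs}
    for plugin in "$dir"/lib*-rdmav*.so; do
        [ -e "$plugin" ] || continue    # an unmatched glob stays literal
        abi=${plugin##*-}               # strip through the last '-': rdmavNN.so
        printf '%s\n' "${abi%.so}"      # drop the suffix: rdmavNN
    done | sort -u
}
```

More than one line of output would mean plugins with different ABI suffixes are installed side by side, which is worth investigating before building.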

Use DOCA_REPO_ROOT to point the layout verifier or libvirt test harness at a specific published DOCA repository when validating a non-latest release.

Verification

Check the current public DOCA repository layout for EL8, EL9, and EL10 on every architecture directory published by NVIDIA:

tests/verify-doca-rpm-layout.sh

Set DOCA_ARCHES to restrict the layout check to specific repository directories:

DOCA_ARCHES="x86_64 ppc64le" tests/verify-doca-rpm-layout.sh

The scheduled upstream watcher checks the public DOCA root index for every repository directory beginning with latest and opens a GitHub issue when NVIDIA adds or removes one. Newly published latest* repositories are automatically layout-verified before the issue is opened; the workflow stays green when the compatibility preconditions still hold and fails only when the new repository cannot be validated or a known repository disappears. The watcher also compares each supported DOCA repository layout against its baseline file and opens a GitHub issue when NVIDIA changes a supported baseline.

The testing/libvirt/remote-doca-vm-test.sh harness can be copied to a libvirt host and used with Rocky cloud images to run end-to-end EL8, EL9, and EL10 build and install validation. It supports x86_64 and ppc64le, and has best-effort aarch64 support through the DOCA arm64-sbsa repository directory. Set DOCA_REPO_ARCH to override the repository directory if needed.

Validated against the public DOCA latest repository on 2026-04-11:

| Target | Repository directory | libibverbs RPM | Provider ABI |
| --- | --- | --- | --- |
| EL8 aarch64 | arm64-sbsa | libibverbs-2601.0.7-1.el8.aarch64.rpm | rdmav59 |
| EL8 aarch64 | sbsa-arm64 | libibverbs-2601.0.7-1.el8.aarch64.rpm | rdmav59 |
| EL8 ppc64le | ppc64le | libibverbs-2601.0.7-1.el8.ppc64le.rpm | rdmav59 |
| EL8 x86_64 | x86_64 | libibverbs-2601.0.7-1.el8.x86_64.rpm | rdmav59 |
| EL9 aarch64 | arm64-sbsa | libibverbs-2601.0.7-1.el9.aarch64.rpm | rdmav59 |
| EL9 aarch64 | sbsa-arm64 | libibverbs-2601.0.7-1.el9.aarch64.rpm | rdmav59 |
| EL9 ppc64le | ppc64le | libibverbs-2601.0.7-1.el9.ppc64le.rpm | rdmav59 |
| EL9 x86_64 | x86_64 | libibverbs-2601.0.7-1.el9.x86_64.rpm | rdmav59 |
| EL10 aarch64 | arm64-sbsa | libibverbs-2601.0.7-1.el10.aarch64.rpm | rdmav59 |
| EL10 aarch64 | sbsa-arm64 | libibverbs-2601.0.7-1.el10.aarch64.rpm | rdmav59 |
| EL10 ppc64le | ppc64le | libibverbs-2601.0.7-1.el10.ppc64le.rpm | rdmav59 |
| EL10 x86_64 | x86_64 | libibverbs-2601.0.7-1.el10.x86_64.rpm | rdmav59 |

Validated against the public DOCA latest-3.2-LTS repository on 2026-04-11:

| Target | Repository directory | libibverbs RPM | Provider ABI |
| --- | --- | --- | --- |
| EL8 aarch64 | arm64-sbsa | libibverbs-2510.0.11-1.el8.aarch64.rpm | rdmav57 |
| EL8 aarch64 | sbsa-arm64 | libibverbs-2510.0.11-1.el8.aarch64.rpm | rdmav57 |
| EL8 ppc64le | ppc64le | libibverbs-2510.0.11-1.el8.ppc64le.rpm | rdmav57 |
| EL8 x86_64 | x86_64 | libibverbs-2510.0.11-1.el8.x86_64.rpm | rdmav57 |
| EL9 aarch64 | arm64-sbsa | libibverbs-2510.0.11-1.el9.aarch64.rpm | rdmav57 |
| EL9 aarch64 | sbsa-arm64 | libibverbs-2510.0.11-1.el9.aarch64.rpm | rdmav57 |
| EL9 ppc64le | ppc64le | libibverbs-2510.0.11-1.el9.ppc64le.rpm | rdmav57 |
| EL9 x86_64 | x86_64 | libibverbs-2510.0.11-1.el9.x86_64.rpm | rdmav57 |
| EL10 aarch64 | arm64-sbsa | libibverbs-2510.0.11-1.el10.aarch64.rpm | rdmav57 |
| EL10 aarch64 | sbsa-arm64 | libibverbs-2510.0.11-1.el10.aarch64.rpm | rdmav57 |
| EL10 ppc64le | ppc64le | libibverbs-2510.0.11-1.el10.ppc64le.rpm | rdmav57 |
| EL10 x86_64 | x86_64 | libibverbs-2510.0.11-1.el10.x86_64.rpm | rdmav57 |

The arm64-sbsa and sbsa-arm64 repository directories currently contain the same aarch64 package payloads. Both are validated because NVIDIA publishes both names.

End-to-end libvirt VM validation was performed for EL8, EL9, and EL10 on x86_64 and ppc64le for DOCA latest and DOCA latest-3.2-LTS. Other architectures are covered by repository payload layout validation only.

Open Source Apache License

This shell script is made available under the Apache License, Version 2.0: https://www.apache.org/licenses/LICENSE-2.0
