diff --git a/.wordlist.txt b/.wordlist.txt
index 0356d26..3587820 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -17,6 +17,8 @@ ARI
APBDIS
api
APIC
+APU
+APUs
args
arp
aspm
@@ -137,6 +139,7 @@ GFLOPs
gfortran
gfx
GFX
+GiB
GMI
GPUP
GPUs
@@ -160,6 +163,7 @@ HPE
HPL
HSA
hugepage
+hugepages
ib
ibdiagnet
ibstat
@@ -209,6 +213,7 @@ lsmem
lsof
lspci
LTS
+LUMI
lvl
makefile
maxBytes
diff --git a/docs/conf.py b/docs/conf.py
index 713f896..c2a05ef 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -13,7 +13,7 @@
# Disable external projects to avoid GitHub API issues
external_projects_remote_repository = ""
-external_projects_current_project = os.environ.get("SPHINX_PROJECT_SLUG", "system-acceptance-docs")
+external_projects_current_project = os.environ.get("SPHINX_PROJECT_SLUG", "system-acceptance")
version = "1.0.0"
release = version
diff --git a/docs/gpus/mi100.md b/docs/gpus/mi100.md
new file mode 100644
index 0000000..d6b64ee
--- /dev/null
+++ b/docs/gpus/mi100.md
@@ -0,0 +1,174 @@
+---
+myst:
+  html_meta:
+    "description": "AMD Instinct MI100 acceptance criteria: prerequisites, health checks, system validation, and performance benchmarks for CDNA PCIe GPU platforms."
+    "keywords": "MI100, AMD Instinct, CDNA, ROCm, PCIe, Infinity Fabric, system acceptance, validation, benchmarks, HBM2"
+---
+
+# AMD Instinct MI100
+
+The AMD Instinct™ MI100 is a data-center compute GPU in a PCIe form factor. This document provides MI100-specific prerequisites, health checks, validation steps, and performance acceptance criteria.
+
+## Overview
+
+The AMD Instinct MI100 introduces the first-generation CDNA architecture in a standard full-height, full-length, dual-slot PCIe® add-in card aimed at HPC and accelerated computing workloads. Each MI100 provides 120 compute units with Matrix Core technology, 32 GB of HBM2 memory at up to 1.2 TB/s, and AMD Infinity Fabric™ link support for direct GPU-to-GPU connectivity in 2- and 4-GPU hive configurations. The card is passively cooled with a 300 W TDP and supports PCIe® Gen4 host connectivity.
+
+ROCm identifies the MI100 as gfx908. The MI100 Infinity Fabric™ topology tops out at 4 GPUs per hive, so the validation reference configuration for this document is a single 4-GPU MI100 hive with Infinity Fabric™ bridges providing direct GPU-to-GPU connectivity across all peers. Larger deployments (for example, dual-socket servers with two 4-GPU hives for 8 MI100s total) are common; in those systems, cross-hive traffic traverses the host PCIe fabric and the per-hive criteria below apply to each hive independently.
+
+- **[MI100 Product Page](https://www.amd.com/en/products/accelerators/instinct/mi100.html)**
+- **[MI100 Product Brief](https://www.amd.com/content/dam/amd/en/documents/instinct-business-docs/product-briefs/instinct-mi100-brochure.pdf)**
+- **[MI100 Microarchitecture](https://instinct.docs.amd.com/latest/gpu-arch/mi100.html)**
+
+## System requirements
+
+### Operating system support
+
+For the most up-to-date information on supported operating systems and distributions, refer to the official ROCm documentation:
+
+[ROCm System Requirements - Supported Distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)
+
+```{note}
+[ROCm docs](https://rocm.docs.amd.com) is the single source of truth for supported versions, distribution compatibility, and required dependencies for the ROCm toolkit.
+```
+
+For BIOS, NUMA, and OS-level tuning that applies to all AMD Instinct hosts, see [BIOS settings](../common/bios-settings.md) and [OS tuning](../common/os-tuning.md).
+
+### GPU identification
+
+All MI100 GPUs (PCI vendor:device `1002:738c`) should appear in `lspci` output:
+
+```bash
+sudo lspci -d 1002:738c
+```
+
+Expected output example (4-GPU MI100 hive):
+
+```bash
+1d:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
+20:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
+23:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
+26:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Arcturus GL-XL [Instinct MI100] (rev 01)
+```
+
+## Acceptance criteria
+
+The MI100 system acceptance process validates that the platform is correctly configured, stable, and performing to expectations. Follow the sequence: Prerequisites → Basic Health Checks → System Validation → Performance Benchmarks.
+
+### System acceptance process
+
+1. **[Prerequisites validation](#prerequisites-validation)** - Ensure all system requirements and dependencies are met
+2. **[Basic health checks](#basic-health-checks)** - Verify hardware detection and basic system health
+3. **[System validation](#system-validation)** - Conduct comprehensive stress testing and qualification
+4. **[Performance benchmarks](#performance-benchmarks)** - Validate compute, memory, and interconnect performance
+
+The system is accepted when all criteria below are successfully validated.
+
+### Prerequisites validation
+
+Ensure all system requirements are met before proceeding with validation. See the [Prerequisites documentation](../common/prerequisites.md) and [System setup](../common/system-setup.md) for more details.
+
+- ✅ Supported operating system version installed
+- ✅ Compatible ROCm version installed
+- ✅ BIOS configured per [BIOS settings](../common/bios-settings.md), with MI100-specific values per platform vendor
+- ✅ Required kernel parameters present: `pci=realloc=off`, `pci=bfsort`, `iommu=pt`, and `amd_iommu=on` (or `intel_iommu=on` on Intel hosts) — see [Kernel Parameters](../common/kernel-parameters.md)
+- ✅ Minimum 256G system memory available
+- ✅ Latest applicable firmware applied consistently across nodes
+- ✅ ROCm Validation Suite (RVS) installed
+
+### Basic health checks
+
+These checks ensure fundamental system health and proper GPU detection. For detailed procedures, see [Health Checks](../common/health-checks.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Check OS distribution](../common/health-checks.md#check-os-distribution) | `cat /etc/os-release` | **Pass**: OS version listed in compatibility matrix<br>**Fail**: Otherwise |
+| [Check kernel boot arguments](../common/health-checks.md#check-kernel-boot-arguments) | `cat /proc/cmdline` | **Pass**: Contains `pci=realloc=off`, `pci=bfsort`, `iommu=pt`, and `amd_iommu=on` or `intel_iommu=on`<br>**Fail**: Otherwise |
+| [Check for driver errors](../common/health-checks.md#check-for-driver-errors) | `sudo dmesg -T \| grep amdgpu \| grep -i error` | **Pass**: Null<br>**Fail**: Errors reported |
+| [Check available memory](../common/health-checks.md#check-for-available-system-memory) | `lsmem \| grep "Total online memory"` | **Pass**: ≥ 256G<br>**Fail**: Less than 256G |
+| [Check GPU presence](../common/health-checks.md#check-gpu-presence) | `sudo lspci -d 1002:738c` | **Pass**: 4 MI100 GPUs found (per hive)<br>**Fail**: Otherwise |
+| [Check GPU link speed and width](../common/health-checks.md#check-gpu-pcie-bus-link-speed-and-width) | `sudo lspci -d 1002:738c -vvv \| grep -e DevSta -e LnkSta` | **Pass**: Speed 16GT/s, width `x16`, no `FatalErr+`<br>**Fail**: Otherwise |
+| [Monitor utilization metrics](../common/health-checks.md#monitor-utilization-metrics) | `amd-smi monitor -putm` | **Pass**: Idle metrics as specified<br>**Fail**: Otherwise |
+| [Check system kernel logs for errors](../common/health-checks.md#check-system-kernel-logs) | `sudo dmesg -T \| grep -i 'error\|warn\|fail\|exception'` | **Pass**: Null<br>**Fail**: Otherwise |
+
+### System validation
+
+Comprehensive validation ensures system stability under load. For detailed procedures, see [System Validation](../common/system-validation.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Compute/GPU properties](../common/system-validation.md#gpu-properties) | `rvs -c ${RVS_CONF}/gpup_single.conf` | **Pass**: All GPUs listed with no errors<br>**Fail**: Missing GPUs or errors |
+| [GPU stress test (GST)](../common/system-validation.md#gpu-stress-test) | `rvs -c ${RVS_CONF}/MI100/gst_single.conf` | **Pass**: `met: TRUE` in logs<br>**Fail**: Target GFLOP/s not met |
+| [Input energy delay product (IET)](../common/system-validation.md#input-energy-delay-product) | `rvs -c ${RVS_CONF}/MI100/iet_single.conf` | **Pass**: `met: TRUE` for all actions<br>**Fail**: Otherwise |
+| [Memory test (MEM)](../common/system-validation.md#mem) | `rvs -c ${RVS_CONF}/mem.conf -l mem.txt` | **Pass**: All tests passed; bandwidth ≥ 800 GB/s per GPU<br>**Fail**: Any test failed or low bandwidth |
+| [PCIe bandwidth benchmark (PEBB)](../common/system-validation.md#pcie-bandwidth-benchmark) | `rvs -c ${RVS_CONF}/MI100/pebb_single.conf` | **Pass**: All distances and bandwidths displayed<br>**Fail**: Missing data |
+| [PCIe qualification tool (PEQT)](../common/system-validation.md#pcie-qualification-tool) | `rvs -c ${RVS_CONF}/peqt_single.conf` | **Pass**: All actions true<br>**Fail**: Otherwise |
+| [P2P benchmark and qualification tool (PBQT)](../common/system-validation.md#p2p-benchmark-and-qualification-tool) | `rvs -c ${RVS_CONF}/pbqt_single.conf` | **Pass**: `peers:true` lines and non-zero throughput<br>**Fail**: Otherwise |
+
+```{note}
+The reference configuration for this document is a single 4-GPU MI100 hive with AMD Infinity Fabric™ bridges installed, so intra-hive PBQT and TransferBench numbers reflect xGMI throughput. On systems without bridges, P2P traffic traverses the host PCIe fabric and these thresholds will not be met.
+```
+
+### Performance benchmarks
+
+Performance validation ensures the system meets MI100 specifications. For detailed procedures, see [Performance Benchmarking](../common/system-validation.md#performance-benchmarking).
+
+:::{card} Command: `TransferBench a2a`
+[TransferBench all-to-all](../common/system-validation.md#transferbench)
+^^^
+**Pass:** ≥ 270 GB/s aggregate
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `TransferBench p2p`
+[TransferBench peer-to-peer](../common/system-validation.md#transferbench)
+^^^
+
+| Test | Pass criteria |
+|------|--------------|
+| UniDir | ≥ 30 GB/s |
+| BiDir | ≥ 57 GB/s |
+
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `build/all_reduce_perf -b 8 -e 8G -f 2 -g 4`
+[RCCL Allreduce](../common/system-validation.md#rccl-allreduce)
+^^^
+**Pass:** ≥ 72 GB/s busbw (peak, at 8 GiB message size)
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `rocblas-bench` (see code block below)
+[rocBLAS FP32](../common/system-validation.md#rocblas-gemm-benchmarks)
+^^^
+
+```bash
+rocblas-bench -f gemm \
+ -r s -m 4000 -n 4000 -k 4000 \
+ --lda 4000 --ldb 4000 --ldc 4000 \
+ --transposeA N --transposeB T
+```
+
+**Pass:** ≥ 28 TFLOPS per GPU
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `mpiexec -n 4 wrapper.sh`
+[BabelStream](../common/system-validation.md#babelstream)
+^^^
+
+| Kernel | Threshold (MB/s) |
+|--------|-----------------|
+| Copy | ≥ 940,000 |
+| Mul | ≥ 940,000 |
+| Add | ≥ 910,000 |
+| Triad | ≥ 910,000 |
+| Dot | ≥ 950,000 |
+
++++
+**Fail:** otherwise
+:::
diff --git a/docs/gpus/mi210.md b/docs/gpus/mi210.md
new file mode 100644
index 0000000..cdb4dc9
--- /dev/null
+++ b/docs/gpus/mi210.md
@@ -0,0 +1,178 @@
+---
+myst:
+  html_meta:
+    "description": "MI210 GPU system acceptance guide: prerequisites, health checks, system validation, and performance benchmarks for HPC and AI deployments."
+    "keywords": "AMD Instinct MI210, GPU acceptance testing, ROCm, HPC, AI, PCIe GPU, system validation, health checks, BabelStream, rocBLAS, RCCL, TransferBench, CDNA2"
+---
+# AMD Instinct MI210
+
+The AMD Instinct™ MI210 is a mainstream HPC and AI accelerator in a PCIe form factor. This document provides MI210-specific prerequisites, health checks, validation steps, and performance acceptance criteria.
+
+## Overview
+
+The AMD Instinct MI210 brings second-generation CDNA architecture to a standard full-height, full-length, dual-slot PCIe® add-in card aimed at single-server HPC and AI deployments. Each MI210 provides 104 compute units, 64 GB of HBM2e memory at up to 1.6 TB/s of memory bandwidth, and up to three AMD Infinity Fabric™ links that enable direct GPU-to-GPU connectivity in dual- and quad-GPU hive configurations. The card is passively cooled with a 300 W TDP and supports PCIe® Gen4 host connectivity.
+
+ROCm identifies the MI210 as gfx90a. Unlike the OAM-based AMD Instinct MI250 and MI250X, MI210 deployments are PCIe-attached: GPU-to-GPU traffic uses AMD Infinity Fabric™ links when an Infinity Fabric bridge is installed and otherwise traverses the host PCIe fabric.
+
+- **[MI210 Product Page](https://www.amd.com/en/products/accelerators/instinct/mi200/mi210.html)**
+- **[MI200 Series Data Sheet](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instinct-mi200-datasheet.pdf)**
+- **[MI200 Series Microarchitecture](https://instinct.docs.amd.com/latest/gpu-arch/mi250.html)**
+
+## System requirements
+
+### Operating system support
+
+For the most up-to-date information on supported operating systems and distributions, see the official ROCm documentation:
+
+[ROCm System Requirements - Supported Distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)
+
+```{note}
+[ROCm docs](https://rocm.docs.amd.com) is the single source of truth for supported versions, distribution compatibility, and required dependencies for the ROCm toolkit.
+```
+
+For BIOS, NUMA, and OS-level tuning that applies to all AMD Instinct hosts, see [BIOS settings](../common/bios-settings.md) and [OS tuning](../common/os-tuning.md). MI210 systems share the general OS and IOMMU guidance documented for other CDNA 2 platforms but might differ in BIOS power and xGMI topology settings; consult your platform vendor's BIOS guide for MI210-specific values.
+
+### GPU identification
+
+All MI210 GPUs (PCI vendor:device `1002:740f`) should appear in `lspci` output:
+
+```bash
+sudo lspci -d 1002:740f
+```
+
+Expected output example (8-GPU system):
+
+```bash
+03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+27:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+43:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+63:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+83:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+a3:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+c3:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+e3:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran/MI200 [Instinct MI210] (rev 02)
+```
+
+## Acceptance criteria
+
+The MI210 system acceptance process validates that the platform is correctly configured, stable, and performing to expectations. Follow the sequence: Prerequisites → Basic Health Checks → System Validation → Performance Benchmarks.
+
+### System acceptance process
+
+1. **[Prerequisites validation](#prerequisites-validation)** - Ensure all system requirements and dependencies are met
+2. **[Basic health checks](#basic-health-checks)** - Verify hardware detection and basic system health
+3. **[System validation](#system-validation)** - Conduct comprehensive stress testing and qualification
+4. **[Performance benchmarks](#performance-benchmarks)** - Validate compute, memory, and interconnect performance
+
+The system is accepted when all criteria below are successfully validated.
+
+### Prerequisites validation
+
+Ensure all system requirements are met before proceeding with validation. See the [Prerequisites documentation](../common/prerequisites.md) and [System setup](../common/system-setup.md) for more details.
+
+- ✅ Supported operating system version installed
+- ✅ Compatible ROCm version installed
+- ✅ BIOS configured per [BIOS settings](../common/bios-settings.md), with MI210-specific values per platform vendor
+- ✅ Required kernel parameters present: `pci=realloc=off`, `pci=bfsort`, `iommu=pt`, and `amd_iommu=on` (or `intel_iommu=on` on Intel hosts) — see [Kernel Parameters](../common/kernel-parameters.md)
+- ✅ Minimum 512G system memory available
+- ✅ Latest applicable firmware applied consistently across nodes
+- ✅ ROCm Validation Suite (RVS) installed
+
+### Basic health checks
+
+These checks ensure fundamental system health and proper GPU detection. For detailed procedures, see [Health Checks](../common/health-checks.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Check OS distribution](../common/health-checks.md#check-os-distribution) | `cat /etc/os-release` | **Pass**: OS version listed in compatibility matrix<br>**Fail**: Otherwise |
+| [Check kernel boot arguments](../common/health-checks.md#check-kernel-boot-arguments) | `cat /proc/cmdline` | **Pass**: Contains `pci=realloc=off`, `pci=bfsort`, `iommu=pt`, and `amd_iommu=on` or `intel_iommu=on`<br>**Fail**: Otherwise |
+| [Check for driver errors](../common/health-checks.md#check-for-driver-errors) | `sudo dmesg -T \| grep amdgpu \| grep -i error` | **Pass**: Null<br>**Fail**: Errors reported |
+| [Check available memory](../common/health-checks.md#check-for-available-system-memory) | `lsmem \| grep "Total online memory"` | **Pass**: ≥ 512G<br>**Fail**: Less than 512G |
+| [Check GPU presence](../common/health-checks.md#check-gpu-presence) | `sudo lspci -d 1002:740f` | **Pass**: 4 MI210 GPUs found<br>**Fail**: Otherwise |
+| [Check GPU link speed and width](../common/health-checks.md#check-gpu-pcie-bus-link-speed-and-width) | `sudo lspci -d 1002:740f -vvv \| grep -e DevSta -e LnkSta` | **Pass**: Speed 16GT/s, width `x16`, no `FatalErr+`<br>**Fail**: Otherwise |
+| [Monitor utilization metrics](../common/health-checks.md#monitor-utilization-metrics) | `amd-smi monitor -putm` | **Pass**: Idle metrics as specified<br>**Fail**: Otherwise |
+| [Check system kernel logs for errors](../common/health-checks.md#check-system-kernel-logs) | `sudo dmesg -T \| grep -i 'error\|warn\|fail\|exception'` | **Pass**: Null<br>**Fail**: Otherwise |
+
+### System validation
+
+Comprehensive validation ensures system stability under load. For detailed procedures, see [System Validation](../common/system-validation.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Compute/GPU properties](../common/system-validation.md#gpu-properties) | `rvs -c ${RVS_CONF}/gpup_single.conf` | **Pass**: All GPUs listed with no errors<br>**Fail**: Missing GPUs or errors |
+| [GPU stress test (GST)](../common/system-validation.md#gpu-stress-test) | `rvs -c ${RVS_CONF}/MI210/gst_single.conf` | **Pass**: `met: TRUE` in logs<br>**Fail**: Target GFLOP/s not met |
+| [Input energy delay product (IET)](../common/system-validation.md#input-energy-delay-product) | `rvs -c ${RVS_CONF}/MI210/iet_single.conf` | **Pass**: `met: TRUE` for all actions<br>**Fail**: Otherwise |
+| [Memory test (MEM)](../common/system-validation.md#mem) | `rvs -c ${RVS_CONF}/mem.conf -l mem.txt` | **Pass**: All tests passed; bandwidth ~1.1 TB/s per GPU<br>**Fail**: Any test failed or low bandwidth |
+| [PCIe bandwidth benchmark (PEBB)](../common/system-validation.md#pcie-bandwidth-benchmark) | `rvs -c ${RVS_CONF}/MI210/pebb_single.conf` | **Pass**: All distances and bandwidths displayed<br>**Fail**: Missing data |
+| [PCIe qualification tool (PEQT)](../common/system-validation.md#pcie-qualification-tool) | `rvs -c ${RVS_CONF}/peqt_single.conf` | **Pass**: All actions true<br>**Fail**: Otherwise |
+| [P2P benchmark and qualification tool (PBQT)](../common/system-validation.md#p2p-benchmark-and-qualification-tool) | `rvs -c ${RVS_CONF}/pbqt_single.conf` | **Pass**: `peers:true` lines and non-zero throughput<br>**Fail**: Otherwise |
+
+### Performance benchmarks
+
+Performance validation ensures the system meets MI210 specifications. For detailed procedures, see [Performance Benchmarking](../common/system-validation.md#performance-benchmarking).
+
+:::{card} Command: `TransferBench a2a`
+[TransferBench all-to-all](../common/system-validation.md#transferbench)
+^^^
+**Pass:** ≥ 80 GB/s per GPU aggregate
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `TransferBench p2p`
+[TransferBench peer-to-peer](../common/system-validation.md#transferbench)
+^^^
+
+| Test | Pass criteria |
+|------|--------------|
+| UniDir | ≥ 35 GB/s per same-socket peer-pair |
+| BiDir | ≥ 65 GB/s per same-socket peer-pair (combined) |
+
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `build/all_reduce_perf -b 8 -e 8G -f 2 -g <N>`
+[RCCL Allreduce](../common/system-validation.md#rccl-allreduce)
+^^^
+
+| Config | Pass criteria |
+|--------|--------------|
+| `-g 4` (single-socket quad) | ≥ 30 GB/s avg bus bandwidth |
+| `-g 8` (dual-socket, cross-socket ring) | ≥ 8 GB/s avg bus bandwidth |
+
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `rocblas-bench` (see code block below)
+[rocBLAS FP32](../common/system-validation.md#rocblas-gemm-benchmarks)
+^^^
+
+```bash
+rocblas-bench -f gemm \
+ -r s -m 4000 -n 4000 -k 4000 \
+ --lda 4000 --ldb 4000 --ldc 4000 \
+ --transposeA N --transposeB T
+```
+
+**Pass:** ≥ 28,000 GFLOPS (28 TFLOPS) per GPU
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `mpiexec -n 4 wrapper.sh`
+[BabelStream](../common/system-validation.md#babelstream)
+^^^
+
+| Kernel | Threshold (MB/s) |
+|--------|-----------------|
+| Copy | ≥ 1,230,000 |
+| Mul | ≥ 1,225,000 |
+| Add | ≥ 1,115,000 |
+| Triad | ≥ 1,115,000 |
+| Dot | ≥ 1,170,000 |
+
++++
+**Fail:** otherwise
+:::
diff --git a/docs/gpus/mi250.md b/docs/gpus/mi250.md
new file mode 100644
index 0000000..a1bce0d
--- /dev/null
+++ b/docs/gpus/mi250.md
@@ -0,0 +1,179 @@
+---
+myst:
+  html_meta:
+    "description": "AMD Instinct MI250 acceptance criteria: prerequisites, health checks, system validation, and performance benchmarks for CDNA 2 OAM GPU platforms."
+    "keywords": "MI250, AMD Instinct, CDNA 2, OAM, ROCm, xGMI, Infinity Fabric, system acceptance, validation, benchmarks, HBM2e"
+---
+
+# AMD Instinct MI250 / MI250X
+
+The AMD Instinct™ MI250 is a data-center GPU in an OAM form factor. This document provides MI250-specific prerequisites, health checks, validation steps, and performance acceptance criteria. It also applies to the AMD Instinct™ MI250X, which shares the same CDNA 2 (gfx90a) OAM platform and acceptance criteria; MI250X-specific differences are noted inline.
+
+## Overview
+
+The AMD Instinct MI250 brings the second-generation CDNA architecture to an OCP Accelerator Module (OAM) form factor purpose-built for HPC and large-scale AI training. Each MI250 packages two Graphics Compute Dies (GCDs) under a single OAM, each GCD presenting 104 CUs (110 on the MI250X) with Matrix Core technology and 64 GB of HBM2e memory at up to 1.6 TB/s, for a combined 128 GB and 3.2 TB/s per OAM. The two GCDs on an OAM are linked by a high-bandwidth on-package AMD Infinity Fabric™ interconnect, and each OAM exposes additional xGMI ports for direct GPU-to-GPU connectivity across a 4-OAM all-to-all mesh. A typical qualified configuration hosts 4 MI250 OAMs (8 GCDs total) per node.
+
+ROCm identifies the MI250 as gfx90a and enumerates each GCD as an independent GPU, so a 4-OAM node reports 8 devices with 64 GB of HBM2e each. GCDs are connected to each other through AMD Infinity Fabric™ (xGMI); the host connection is validated as a PCIe® Gen4 x16 link in the health checks below.
+
+The MI250X is the higher-performance variant of the same CDNA 2 (gfx90a) OAM platform and is validated using the criteria in this document. It powers some of the world's largest supercomputers, including Frontier and LUMI. MI250X reference deployments commonly use an 8-OAM (16-GCD) node topology; scale the per-node GCD counts in the commands below accordingly (for example, `-g 16` for RCCL and `mpiexec -n 16` for BabelStream on an 8-OAM node). MI250X also shares the MI250 PCI vendor:device ID (`1002:740c`).
+
+- **[MI250 Product Page](https://www.amd.com/en/products/accelerators/instinct/mi200/mi250.html)**
+- **[MI250X Product Page](https://www.amd.com/en/products/accelerators/instinct/mi200/mi250x.html)**
+- **[MI200 Series Microarchitecture](https://instinct.docs.amd.com/latest/gpu-arch/mi250.html)**
+- **[MI200 Series Data Sheet](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instinct-mi200-datasheet.pdf)**
+
+## System requirements
+
+### Operating system support
+
+For the most up-to-date information on supported operating systems and distributions, refer to the official ROCm documentation:
+
+[ROCm System Requirements - Supported Distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)
+
+```{note}
+[ROCm docs](https://rocm.docs.amd.com) is the single source of truth for supported versions, distribution compatibility, and required dependencies for the ROCm toolkit.
+```
+
+For BIOS, NUMA, and OS-level tuning that applies to all AMD Instinct hosts, see [BIOS settings](../common/bios-settings.md) and [OS tuning](../common/os-tuning.md).
+
+### GPU identification
+
+All MI250 GCDs (PCI vendor:device `1002:740c`) should appear in `lspci` output. On a fully populated 4-OAM MI250 platform you should see 8 GCD entries (2 per OAM):
+
+```bash
+sudo lspci -d 1002:740c
+```
+
+Expected output example:
+
+```bash
+0000:11:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+0000:14:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+0000:32:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+0000:35:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+0000:8e:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+0000:93:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+0000:ae:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+0000:b3:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Aldebaran (rev 01)
+```
+
+The 8 GCDs are paired by OAM (for example, `11` and `14` are the two GCDs on one OAM, `32` and `35` on the next, and so on). Same-OAM GCD pairs are connected by a high-bandwidth on-package link, while cross-OAM connectivity uses external xGMI ports in a 4-OAM all-to-all mesh.
+
+## Acceptance criteria
+
+The MI250 system acceptance process validates that the platform is correctly configured, stable, and performing to expectations. Follow the sequence: Prerequisites → Basic Health Checks → System Validation → Performance Benchmarks.
+
+### System acceptance process
+
+1. **[Prerequisites validation](#prerequisites-validation)** - Ensure all system requirements and dependencies are met
+2. **[Basic health checks](#basic-health-checks)** - Verify hardware detection and basic system health
+3. **[System validation](#system-validation)** - Conduct comprehensive stress testing and qualification
+4. **[Performance benchmarks](#performance-benchmarks)** - Validate compute, memory, and interconnect performance
+
+The system is accepted when all criteria below are successfully validated.
+
+### Prerequisites validation
+
+Ensure all system requirements are met before proceeding with validation. See the [Prerequisites documentation](../common/prerequisites.md) and [System setup](../common/system-setup.md) for more details.
+
+- ✅ Supported operating system version installed
+- ✅ Compatible ROCm version installed (verify: `cat /opt/rocm/.info/version`); see the [ROCm System Requirements](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html) for the current supported version matrix
+- ✅ BIOS configured per [BIOS settings](../common/bios-settings.md), with MI250-specific values per platform vendor
+- ✅ Required kernel parameters present: `pci=realloc=off iommu=pt`
+- ✅ Minimum 1T system memory available
+- ✅ Latest applicable firmware applied consistently across nodes
+- ✅ ROCm Validation Suite (RVS) installed
+
+### Basic health checks
+
+These checks ensure fundamental system health and proper GPU detection. For detailed procedures, see [Health Checks](../common/health-checks.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Check OS distribution](../common/health-checks.md#check-os-distribution) | `cat /etc/os-release` | **Pass**: OS version listed in compatibility matrix<br>**Fail**: Otherwise |
+| [Check kernel boot arguments](../common/health-checks.md#check-kernel-boot-arguments) | `cat /proc/cmdline` | **Pass**: Contains `pci=realloc=off iommu=pt`<br>**Fail**: Otherwise |
+| [Check for driver errors](../common/health-checks.md#check-for-driver-errors) | `sudo dmesg -T \| grep amdgpu \| grep -i error` | **Pass**: Null<br>**Fail**: Errors reported |
+| [Check available memory](../common/health-checks.md#check-for-available-system-memory) | `lsmem \| grep "Total online memory"` | **Pass**: ≥ 1T<br>**Fail**: Less than 1T |
+| [Check GPU presence](../common/health-checks.md#check-gpu-presence) | `sudo lspci -d 1002:740c` | **Pass**: 8 MI250 GCDs found<br>**Fail**: Otherwise |
+| [Check GPU link speed and width](../common/health-checks.md#check-gpu-pcie-bus-link-speed-and-width) | `sudo lspci -d 1002:740c -vvv \| grep -e DevSta -e LnkSta` | **Pass**: Speed PCIe Gen 4 (16 GT/s), width `x16`, no `FatalErr+`<br>**Fail**: Otherwise |
+| [Monitor utilization metrics](../common/health-checks.md#monitor-utilization-metrics) | `amd-smi monitor -putm` | **Pass**: Idle metrics as specified<br>**Fail**: Otherwise |
+| [Check system kernel logs for errors](../common/health-checks.md#check-system-kernel-logs) | `sudo dmesg -T \| grep -i 'error\|warn\|fail\|exception'` | **Pass**: Null<br>**Fail**: Otherwise |
+
+### System validation
+
+Comprehensive validation ensures system stability under load. For detailed procedures, see [System Validation](../common/system-validation.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Compute/GPU properties](../common/system-validation.md#gpu-properties) | `rvs -c ${RVS_CONF}/gpup_single.conf` | **Pass**: All GCDs listed with no errors<br>**Fail**: Missing GCDs or errors |
+| [GPU stress test (GST)](../common/system-validation.md#gpu-stress-test) | `rvs -c ${RVS_CONF}/MI250/gst_single.conf` | **Pass**: `met: TRUE` in logs<br>**Fail**: Target GFLOP/s not met |
+| [Input energy delay product (IET)](../common/system-validation.md#input-energy-delay-product) | `rvs -c ${RVS_CONF}/MI250/iet_single.conf` | **Pass**: `met: TRUE` for all actions<br>**Fail**: Otherwise |
+| [Memory test (MEM)](../common/system-validation.md#mem) | `rvs -c ${RVS_CONF}/mem.conf -l mem.txt` | **Pass**: All tests passed; bandwidth ≥ 1050 GB/s per GCD<br>**Fail**: Any test failed or low bandwidth |
+| [PCIe bandwidth benchmark (PEBB)](../common/system-validation.md#pcie-bandwidth-benchmark) | `rvs -c ${RVS_CONF}/MI250/pebb_single.conf` | **Pass**: All distances and bandwidths displayed<br>**Fail**: Missing data |
+| [PCIe qualification tool (PEQT)](../common/system-validation.md#pcie-qualification-tool) | `rvs -c ${RVS_CONF}/peqt_single.conf` | **Pass**: All actions true<br>**Fail**: Otherwise |
+| [P2P benchmark and qualification tool (PBQT)](../common/system-validation.md#p2p-benchmark-and-qualification-tool) | `rvs -c ${RVS_CONF}/pbqt_single.conf` | **Pass**: `peers:true` lines and non-zero throughput across all xGMI peers<br>**Fail**: Otherwise |
+
+### Performance benchmarks
+
+Performance validation ensures the system meets MI250 specifications. For detailed procedures, see [Performance Benchmarking](../common/system-validation.md#performance-benchmarking).
+
+:::{card} Command: `TransferBench a2a`
+[TransferBench all-to-all](../common/system-validation.md#transferbench)
+^^^
+**Pass:** ≥ 800 GB/s aggregate
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `TransferBench p2p`
+[TransferBench peer-to-peer](../common/system-validation.md#transferbench)
+^^^
+
+| Test | Pass criteria |
+|------|--------------|
+| UniDir | ≥ 30 GB/s |
+| BiDir | ≥ 55 GB/s |
+
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `build/all_reduce_perf -b 8 -e 8G -f 2 -g 8`
+[RCCL Allreduce](../common/system-validation.md#rccl-allreduce)
+^^^
+**Pass:** ≥ 125 GB/s busbw (peak, at 8 GiB message size)
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `rocblas-bench` (see code block below)
+[rocBLAS FP32](../common/system-validation.md#rocblas-gemm-benchmarks)
+^^^
+
+```bash
+rocblas-bench -f gemm \
+ -r s -m 4000 -n 4000 -k 4000 \
+ --lda 4000 --ldb 4000 --ldc 4000 \
+ --transposeA N --transposeB T
+```
+
+**Pass:** ≥ 30 TFLOPS per GCD
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `mpiexec -n 8 wrapper.sh`
+[BabelStream](../common/system-validation.md#babelstream)
+^^^
+
+| Kernel | Threshold (MB/s) |
+|--------|-----------------|
+| Copy | ≥ 1,200,000 |
+| Mul | ≥ 1,200,000 |
+| Add | ≥ 1,100,000 |
+| Triad | ≥ 1,100,000 |
+| Dot | ≥ 1,200,000 |
+
++++
+**Fail:** otherwise
+:::
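
Review note (not part of the patch): the BabelStream thresholds in the card above lend themselves to a mechanical check. The following is a minimal sketch, assuming the benchmark's tabular output (a kernel name followed by its MBytes/sec value on each result line) has been saved to a file. The function name `check_babelstream` and the PASS/FAIL wording are hypothetical; the thresholds are copied from the table above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare saved BabelStream results against minimum thresholds.
# Assumes result lines of the form "<Kernel> <MBytes/sec> ..." (one per kernel, per rank).

check_babelstream() {
    local results_file="$1"
    awk '
        BEGIN {
            # Minimum bandwidth in MB/s, copied from the threshold table above.
            min["Copy"] = 1200000; min["Mul"]   = 1200000
            min["Add"]  = 1100000; min["Triad"] = 1100000
            min["Dot"]  = 1200000
        }
        ($1 in min) {
            seen[$1] = 1
            if ($2 + 0 < min[$1]) {
                printf "FAIL %s: %s < %d MB/s\n", $1, $2, min[$1]
                failed = 1
            }
        }
        END {
            # A kernel that never appears in the output also counts as a failure.
            for (k in min) {
                if (!(k in seen)) {
                    printf "FAIL %s: no result found\n", k
                    failed = 1
                }
            }
            if (!failed) print "PASS"
            exit failed
        }
    ' "$results_file"
}
```

With several MPI ranks writing to the same file, every matching line is checked, so a single slow GCD is enough to fail the run. The function returns a non-zero exit status on failure, which makes it usable from a larger acceptance script.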
diff --git a/docs/gpus/mi300a.md b/docs/gpus/mi300a.md
new file mode 100644
index 0000000..bf24b10
--- /dev/null
+++ b/docs/gpus/mi300a.md
@@ -0,0 +1,172 @@
+---
+myst:
+ html_meta:
+ "description": "AMD Instinct MI300A acceptance criteria — prerequisites, health checks, system validation, and performance benchmarks for CDNA 3 APU platforms."
+ "keywords": "MI300A, AMD Instinct, APU, CDNA 3, ROCm, system acceptance, validation, benchmarks, HBM3"
+---
+
+# AMD Instinct MI300A
+
+The AMD Instinct™ MI300A is a data-center Accelerated Processing Unit (APU) that integrates AMD "Zen 4" CPU cores and CDNA 3 GPU compute dies on a single package with unified HBM3 memory. This document provides MI300A-specific prerequisites, health checks, validation steps, and performance acceptance criteria.
+
+## Overview
+
+The MI300A is built on the CDNA 3 architecture (gfx942) and combines CPU and GPU compute dies sharing a single coherent pool of 128 GB HBM3 per APU. Unlike discrete OAM accelerators, MI300A platforms are vendor-defined; a typical qualified configuration hosts 4 MI300A APUs per node.
+
+- **[MI300A Product Page](https://www.amd.com/en/products/accelerators/instinct/mi300/mi300a.html)**
+- **[MI300A Data Sheet](https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/data-sheets/amd-instinct-mi300a-data-sheet.pdf)**
+- **[AMD Instinct MI300 Series microarchitecture](https://instinct.docs.amd.com/latest/gpu-arch/mi300.html)**
+
+## System requirements
+
+### Operating system support
+
+For the most up-to-date information on supported operating systems and distributions, refer to the official ROCm documentation:
+
+[ROCm System Requirements - Supported Distributions](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html#supported-distributions)
+
+```{note}
+[ROCm docs](https://rocm.docs.amd.com) is the single source of truth for supported versions, distribution compatibility, and required dependencies for the ROCm toolkit.
+```
+
+For BIOS, IOMMU, transparent hugepages, NUMA, and OS-level tuning that applies to all AMD Instinct hosts, see [BIOS settings](../common/bios-settings.md), [OS tuning](../common/os-tuning.md), and [Kernel parameters](../common/kernel-parameters.md). MI300A requires a Linux kernel that supports "Zen 4" (≥ 5.18 recommended).
+
+### GPU identification
+
+All MI300A APUs (PCI vendor:device `1002:74a0`) should appear in `lspci` output:
+
+```bash
+sudo lspci -d 1002:74a0
+```
+
+Expected output example:
+
+```bash
+0000:01:00.0 Processing accelerators: Advanced Micro Devices, Inc. [AMD/ATI] Aqua Vanjaram [Instinct MI300A]
+0001:01:00.0 Processing accelerators: Advanced Micro Devices, Inc. [AMD/ATI] Aqua Vanjaram [Instinct MI300A]
+0002:01:00.0 Processing accelerators: Advanced Micro Devices, Inc. [AMD/ATI] Aqua Vanjaram [Instinct MI300A]
+0003:01:00.0 Processing accelerators: Advanced Micro Devices, Inc. [AMD/ATI] Aqua Vanjaram [Instinct MI300A]
+```
+
+## Acceptance criteria
+
+The MI300A system acceptance process validates that the platform is correctly configured, stable, and performing to expectations. Follow the sequence: Prerequisites → Basic Health Checks → System Validation → Performance Benchmarks.
+
+### System acceptance process
+
+1. **[Prerequisites validation](#prerequisites-validation)** - Ensure all system requirements and dependencies are met
+2. **[Basic health checks](#basic-health-checks)** - Verify hardware detection and basic system health
+3. **[System validation](#system-validation)** - Conduct comprehensive stress testing and qualification
+4. **[Performance benchmarks](#performance-benchmarks)** - Validate compute, memory, and interconnect performance
+
+The system is accepted when all criteria below are successfully validated.
+
+### Prerequisites validation
+
+Ensure all system requirements are met before proceeding with validation. See the [Prerequisites documentation](../common/prerequisites.md) and [System setup](../common/system-setup.md) for more details.
+
+- ✅ Supported operating system version installed with kernel ≥ 5.18 (Zen 4 support)
+- ✅ Compatible ROCm version installed (verify: `cat /opt/rocm/.info/version`); see the [ROCm System Requirements](https://rocm.docs.amd.com/projects/install-on-linux/en/latest/reference/system-requirements.html) for the current supported version matrix
+- ✅ BIOS configured per [BIOS settings](../common/bios-settings.md), with MI300A-specific values per platform vendor (IOMMU off, memory interleaving, NPS)
+- ✅ Required kernel parameters present: `pci=realloc=off transparent_hugepage=always numa_balancing=disable`
+- ✅ Sysctl tunings applied: `vm.compaction_proactiveness=20`, `vm.max_map_count` increased per ROCm guide
+- ✅ Environment variables (where applicable):
+ - `HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0`
+ - `GPU_MAX_ALLOC_PERCENT` and `GPU_SINGLE_ALLOC_PERCENT` tuned per workload
+- ✅ Minimum 512 GB (4 × 128 GB) of unified HBM3 visible to the OS as usable host memory (note: MI300A's HBM is shared with the CPU)
+- ✅ Latest applicable firmware applied consistently across nodes
+- ✅ ROCm Validation Suite (RVS) installed
+
+### Basic health checks
+
+These checks ensure fundamental system health and proper APU detection. For detailed procedures, see [Health Checks](../common/health-checks.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Check OS distribution](../common/health-checks.md#check-os-distribution) | `cat /etc/os-release` | **Pass**: OS version listed in compatibility matrix<br>**Fail**: Otherwise |
+| [Check kernel boot arguments](../common/health-checks.md#check-kernel-boot-arguments) | `cat /proc/cmdline` | **Pass**: Contains `pci=realloc=off transparent_hugepage=always numa_balancing=disable`<br>**Fail**: Missing any required parameter |
+| [Check for driver errors](../common/health-checks.md#check-for-driver-errors) | `sudo dmesg -T \| grep amdgpu \| grep -i error` | **Pass**: Null<br>**Fail**: Errors reported |
+| [Check available memory](../common/health-checks.md#check-for-available-system-memory) | `lsmem \| grep "Total online memory"` | **Pass**: ≥ 512 GB (4 × 128 GB) of unified HBM3 visible to the OS<br>**Fail**: Less than 512 GB visible to the OS |
+| [Check GPU presence](../common/health-checks.md#check-gpu-presence) | `sudo lspci -d 1002:74a0` | **Pass**: 4 MI300A APUs found<br>**Fail**: Otherwise |
+| [Check GPU link speed and width](../common/health-checks.md#check-gpu-pcie-bus-link-speed-and-width) | `sudo lspci -d 1002:74a0 -vvv \| grep -e DevSta -e LnkSta` | **Pass**: Speed PCIe Gen 5 (32 GT/s), width `x16`, no `FatalErr+`<br>**Fail**: Otherwise |
+| [Monitor utilization metrics](../common/health-checks.md#monitor-utilization-metrics) | `amd-smi monitor -putm` | **Pass**: Idle metrics as specified<br>**Fail**: Otherwise |
+| [Check system kernel logs for errors](../common/health-checks.md#check-system-kernel-logs) | `sudo dmesg -T \| grep -i 'error\|warn\|fail\|exception'` | **Pass**: Null<br>**Fail**: Otherwise |
+
+### System validation
+
+Comprehensive validation ensures system stability under load. For detailed procedures, see [System Validation](../common/system-validation.md).
+
+| Test | Command | Pass/Fail criteria |
+|------|---------|-------------------|
+| [Compute/GPU properties](../common/system-validation.md#gpu-properties) | `rvs -c ${RVS_CONF}/gpup_single.conf` | **Pass**: All APUs listed with no errors<br>**Fail**: Missing APUs or errors |
+| [GPU stress test (GST)](../common/system-validation.md#gpu-stress-test) | `rvs -c ${RVS_CONF}/MI300A/gst_single.conf` | **Pass**: `met: TRUE` in logs<br>**Fail**: Target GFLOP/s not met |
+| [Input energy delay product (IET)](../common/system-validation.md#input-energy-delay-product) | `rvs -c ${RVS_CONF}/MI300A/iet_single.conf` | **Pass**: `met: TRUE` for all actions<br>**Fail**: Otherwise |
+| [Memory test (MEM)](../common/system-validation.md#mem) | `rvs -c ${RVS_CONF}/mem.conf -l mem.txt` | **Pass**: All tests passed; bandwidth ≥ 2.0 TB/s per APU<br>**Fail**: Any test failed or low bandwidth |
+| [PCIe bandwidth benchmark (PEBB)](../common/system-validation.md#pcie-bandwidth-benchmark) | `rvs -c ${RVS_CONF}/MI300A/pebb_single.conf` | **Pass**: All distances and bandwidths displayed<br>**Fail**: Missing data |
+| [PCIe qualification tool (PEQT)](../common/system-validation.md#pcie-qualification-tool) | `rvs -c ${RVS_CONF}/peqt_single.conf` | **Pass**: All actions true<br>**Fail**: Otherwise |
+| [P2P benchmark and qualification tool (PBQT)](../common/system-validation.md#p2p-benchmark-and-qualification-tool) | `rvs -c ${RVS_CONF}/pbqt_single.conf` | **Pass**: `peers:true` lines and non-zero throughput across all xGMI peers<br>**Fail**: Otherwise |
+
+### Performance benchmarks
+
+Performance validation ensures the system meets MI300A specifications. For detailed procedures, see [Performance Benchmarking](../common/system-validation.md#performance-benchmarking).
+
+:::{card} Command: `TransferBench a2a`
+[TransferBench all-to-all](../common/system-validation.md#transferbench)
+^^^
+**Pass:** ≥ 700 GB/s aggregate
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `TransferBench p2p`
+[TransferBench peer-to-peer](../common/system-validation.md#transferbench)
+^^^
+
+| Test | Pass criteria |
+|------|--------------|
+| UniDir | ≥ 80 GB/s |
+| BiDir | ≥ 155 GB/s |
+
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `build/all_reduce_perf -b 8 -e 8G -f 2 -g 4`
+[RCCL Allreduce](../common/system-validation.md#rccl-allreduce)
+^^^
+**Pass:** ≥ 230 GB/s busbw (peak, at 8 GiB message size)
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `rocblas-bench` (see code block below)
+[rocBLAS FP32](../common/system-validation.md#rocblas-gemm-benchmarks)
+^^^
+
+```bash
+rocblas-bench -f gemm \
+ -r s -m 4000 -n 4000 -k 4000 \
+ --lda 4000 --ldb 4000 --ldc 4000 \
+ --transposeA N --transposeB T
+```
+
+**Pass:** ≥ 60 TFLOPS per APU
++++
+**Fail:** otherwise
+:::
+
+:::{card} Command: `mpiexec -n 4 wrapper.sh`
+[BabelStream](../common/system-validation.md#babelstream)
+^^^
+
+| Kernel | Threshold (MB/s) |
+|--------|-----------------|
+| Copy | ≥ 2,900,000 |
+| Mul | ≥ 3,000,000 |
+| Add | ≥ 3,250,000 |
+| Triad | ≥ 3,250,000 |
+| Dot | ≥ 2,200,000 |
+
++++
+**Fail:** otherwise
+:::
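
Review note (not part of the patch): the MI300A page lists the required kernel parameters twice, once in the prerequisites and once in the "Check kernel boot arguments" health check, and the pass criterion is easy to misjudge by eye. The following is a minimal sketch of a scripted version, assuming parameters in `/proc/cmdline` are space-delimited. The function name `missing_kernel_params` and its output format are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report required kernel parameters missing from a command line.
# Usage: missing_kernel_params "<cmdline string>" <param> [<param> ...]

missing_kernel_params() {
    local cmdline="$1"; shift
    local param
    local missing=()
    for param in "$@"; do
        # Pad with spaces so only whole, space-delimited parameters match.
        case " $cmdline " in
            *" $param "*) ;;
            *) missing+=("$param") ;;
        esac
    done
    if [ "${#missing[@]}" -gt 0 ]; then
        echo "FAIL: missing ${missing[*]}"
        return 1
    fi
    echo "PASS"
}

# On a live node (parameter list copied from the health-check table above):
# missing_kernel_params "$(cat /proc/cmdline)" \
#     pci=realloc=off transparent_hugepage=always numa_balancing=disable
```

Passing the command line as an argument rather than reading `/proc/cmdline` inside the function keeps the check testable against captured strings from other nodes.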
diff --git a/docs/index.md b/docs/index.md
index 1915708..4ef6467 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -138,6 +138,10 @@ Start by selecting the page for the specific GPU accelerator you are validating.
- **[AMD Instinct MI350X](gpus/mi350x.md)**
- **[AMD Instinct MI325X](gpus/mi325x.md)**
- **[AMD Instinct MI300X](gpus/mi300x.md)**
+- **[AMD Instinct MI300A](gpus/mi300a.md)**
+- **[AMD Instinct MI250 / MI250X](gpus/mi250.md)**
+- **[AMD Instinct MI210](gpus/mi210.md)**
+- **[AMD Instinct MI100](gpus/mi100.md)**
Follow the GPU page end‑to‑end; it will walk you through verifying system prerequisites, running health checks, executing validation suites and microbenchmarks, and applying acceptance criteria thresholds.
diff --git a/docs/sphinx/_toc.yml.in b/docs/sphinx/_toc.yml.in
index 3eb17cd..6dff208 100644
--- a/docs/sphinx/_toc.yml.in
+++ b/docs/sphinx/_toc.yml.in
@@ -7,7 +7,11 @@ subtrees:
- file: gpus/mi350x.md
- file: gpus/mi325x.md
- file: gpus/mi300x.md
-
+ - file: gpus/mi300a.md
+ - file: gpus/mi250.md
+ - file: gpus/mi210.md
+ - file: gpus/mi100.md
+
- caption: System Configuration
entries:
- file: common/prerequisites.md