Skip to content

feat(bench/compare): add fio xnvme compare workflow#64

Open
yonggilsong wants to merge 5 commits into
xnvme:mainfrom
yonggilsong:priv/yg/comp_bench
Open

feat(bench/compare): add fio xnvme compare workflow#64
yonggilsong wants to merge 5 commits into
xnvme:mainfrom
yonggilsong:priv/yg/comp_bench

Conversation

@yonggilsong
Copy link
Copy Markdown
Collaborator

Summary

Add an AiSIO compare benchmark workflow for fio + xNVMe so we can compare
the spdk and upcie backends under the same benchmark matrix.

This series adds:

  • tasks/bench_fio_compare.yaml
  • fio_xnvme support in the benchmark path
  • raw result collection into a single compare JSON
  • compare HTML visualization
  • validation and smoke coverage
  • safer CPU control handling when cpupower, turbo sysfs, or CPU frequency
    sysfs entries are unavailable
  • quoted PCI_BLACKLIST handling for multi-BDF device configs

Usage

Setup order:

cijoe --monitor -c configs/transport.toml tasks/setup_ubuntu.yaml
cijoe --monitor -c configs/transport.toml -c configs/udmabuf_import.toml tasks/setup_udmabuf_import.yaml
cijoe --monitor -c configs/transport.toml -c configs/nvstack.toml tasks/setup_nvstack.yaml
cijoe --monitor -c configs/transport.toml -c configs/aisio.toml tasks/setup_aisio.yaml

Run the compare benchmark:

cijoe --monitor \
  -c configs/transport.toml \
  -c configs/bench_fio_compare.toml \
  -c configs/devices_*.toml \
  tasks/bench_fio_compare.yaml

Results are written under cijoe-output/artifacts/fio-compare/<bdf>/.

Current defaults are runtime=5s and fio_size=16GiB.

TODO

  • Add a standalone fio_xnvme script with add_args() / main()
  • Reduce duplication in bench_fio_compare.yaml
  • Revisit platform-specific backend/device binding behavior
  • Improve CPU frequency support detection on systems without standard sysfs exposure

Allow benchmark runs to continue when turbo sysfs controls or
cpupower are unavailable on the target system.

Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
Copy link
Copy Markdown
Contributor

@karlowich karlowich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot take in here, but it looks very promising!
So far, I have one comment :)
I will do another pass later.

Comment thread tasks/bench_fio_compare.yaml Outdated
Comment on lines +67 to +72
- name: prepare_read_upcie
run: |
{{ config.xnvme.driver.prefix }} xnvme-driver reset || true
bash -lc 'ns_path=$(find /sys/bus/pci/devices/{{ config.devices[0].pci_addr }}/nvme -mindepth 2 -maxdepth 2 -type d -name "nvme*n*" 2>/dev/null | head -n 1); if [ -n "$ns_path" ]; then blkdiscard -f "/dev/$(basename "$ns_path")" || true; fi'
modprobe uio_pci_generic
DRIVER_OVERRIDE=uio_pci_generic {{ config.xnvme.driver.prefix }} xnvme-driver
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason not to do all the SPDK benchmarks first and then all the uPCIe? This way you only need the "prepare" steps once per backend

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the backend-first structure is cleaner and would avoid repeated prepare steps. I'll rework the workflow so all SPDK runs happen together and all uPCIe runs happen together, unless I hit an ordering dependency while moving the read-prefill flow out of the helper.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I separated trim from prepare in the updated workflow. That makes the device-state transition explicit, and it also lets write and randwrite start from a freshly trimmed state instead of inheriting the state left by earlier workloads.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest taking the trim code out of the workflow file and creating a separate script for it. That way we can simplify the workflow steps visually.
Likewise, we can create a separate script for the pre-fill step.
I want to move towards reducing the complexity of bench_helper and bench_runall.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I’ll move the trim and pre-fill logic out of the workflow.

Do you have a preference between adding separate action modules like fio_xnvme_trim.py / fio_xnvme_prefill.py, or keeping trim, pre-fill, and fio command construction together in fio_xnvme.py?

The separate modules would make the workflow steps explicit and easier to extend independently, while keeping them in fio_xnvme.py would keep the compare-benchmark-specific fio/xNVMe logic in one place.

@NaddiNadja
Copy link
Copy Markdown
Contributor

In regards to the first commit: Why is it necessary to account for cpupower and turbo not being available on the platform? If the machine has been provisioned following the steps and tasks in the AiSIO repo, these should be available, no?

Copy link
Copy Markdown
Contributor

@NaddiNadja NaddiNadja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple questions / suggestions. Haven't looked through it all yet, but wanted to give this part of the feedback before going on weekend :)

Comment thread scripts/bench_helper.py Outdated
Comment thread scripts/bench_helper.py Outdated
Comment thread scripts/bench_helper.py Outdated
Comment thread scripts/fio_xnvme.py Outdated
Comment thread scripts/fio_xnvme.py Outdated
Comment thread scripts/bench_helper.py Outdated
Comment thread scripts/bench_helper.py
@yonggilsong
Copy link
Copy Markdown
Collaborator Author

In regards to the first commit: Why is it necessary to account for cpupower and turbo not being available on the platform? If the machine has been provisioned following the steps and tasks in the AiSIO repo, these should be available, no?

In the ideal setup, yes, these controls should be available after following the AiSIO provisioning flow. The issue I ran into was that this assumption does not always hold in practice on all target machines. In particular, custom dmabuf kernels do not necessarily have a matching linux-tools package available for cpupower, and some platforms still do not expose the expected CPU frequency/turbo control interfaces even after provisioning. I added this handling so the benchmark reports that CPU control was unavailable instead of aborting mid-run or silently misrepresenting the environment.

Add a dedicated fio+xNVMe compare workflow for SPDK and uPCIe.

Run read, randread, write, and randwrite in one workflow. Store
raw results under a BDF-specific artifact root, collect them into
a single benchmark-results.json, and render a compare-specific
HTML report with the benchmark-fio-compare template.

Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
…sysfs

Skip CPU frequency logging cleanly when the target platform does
not expose the sysfs files the logger expects. This lets
benchmark runs continue without aborting on empty frequency
samples.

Also make the dmabuf kernel setup explicitly enable the
cpufreq-related kernel options needed for scaling_cur_freq
reporting on this platform.

Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
Quote the PCI_BLACKLIST value in device configs so xnvme-driver
prefixes survive shell parsing when multiple BDFs are listed.

Without the quotes, a space-separated blacklist is split into
separate shell words before xnvme-driver sees it.

Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
@karlowich
Copy link
Copy Markdown
Contributor

@yonggilsong

In particular, custom dmabuf kernels do not necessarily have a matching linux-tools package available for cpupower

This has been addressed, now we install cpupower alongside the custom kernel.
Can do we still need the CPU-freq skip? It would be a nice simplification if we don't need it :)

@yonggilsong
Copy link
Copy Markdown
Collaborator Author

@yonggilsong

In particular, custom dmabuf kernels do not necessarily have a matching linux-tools package available for cpupower

This has been addressed, now we install cpupower alongside the custom kernel. Can do we still need the CPU-freq skip? It would be a nice simplification if we don't need it :)

Thanks for the update. Could you point me to the branch or PR that addresses installing cpupower alongside the custom kernel?

I’d like to test that setup first and confirm CPU frequency control works with the custom dmabuf kernel. Once that’s verified, I’ll remove the CPU-freq skip/fallback path as a follow-up simplification.

@karlowich
Copy link
Copy Markdown
Contributor

@yonggilsong, sorry for the slow reply. It is here: 8dcf598

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants