Skip to content

[Feature] Base/Row alignment support for Tensors#136

Open
zacharyvincze wants to merge 56 commits intoROCm:developfrom
zacharyvincze:zv/feature/add-memalign-class
Open

[Feature] Base/Row alignment support for Tensors#136
zacharyvincze wants to merge 56 commits intoROCm:developfrom
zacharyvincze:zv/feature/add-memalign-class

Conversation

@zacharyvincze
Copy link
Copy Markdown
Contributor

PR Description

General

  • Adds padding (row/base pointer alignment) support for roccv::Tensor.
  • Adds MemAlign class which is used as a parameter when allocating memory for Tensors. This specifies what the row alignment and base alignment should be for newly created tensors.

Samples

  • Changes support reading/writing images from host <-> device with tensor padding in mind.
  • Improves help messages and command line argument parsing across all samples.

Tensors

  • Stride calculation logic changed to take non-contiguous data (row padding) into account.
  • Reshape method implementation changed to take into account non-contiguous tensors. Ensures that the shape cannot be changed across the row padding boundary, as this is supposed to be a zero-copy operation which retains the initial memory layout.
  • Quality of life methods for checking whether a tensor is contiguous, and for returning the amount of non-padded (non-garbage) bytes being used by the tensor.
  • Adds copyToHost and copyFromHost methods to the tensor, which accepts and produces contiguous host buffers. Since padding makes copying memory from the host to the device non-trivial, this allows the user to perform a copy which is agnostic to the underlying row padding that a tensor may have.

Python bindings (rocpycv)

  • Adds correctness changes to DLPack conversion. Checks whether strides are provided by DLPack first, if so, uses them directly after converting from element-wise to byte-wise strides, otherwise, assumes a contiguous tensor is being provided by DLPack and recalculates with that assumption.

Tests

  • Helpers which copy from host to device now use Tensor::copyFromHost, Tensor::copyToHost directly instead of using duplicate code.
  • Adds Tensor unit tests to ensure padding is calculated properly, and that any additional methods added to roccv::Tensor are covered. Special attention is paid to the reshape method, since this has undergone the most changes.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds base-pointer and per-row alignment (padding) support to roccv::Tensor, updating stride/reshape semantics and introducing host<->device copy helpers that are padding-aware. It also updates samples, Python bindings, benchmarks, and tests to operate correctly with non-contiguous (row-padded) tensors.

Changes:

  • Introduce MemAlignment and update Tensor allocation/stride/reshape logic to support row/base alignment and non-contiguous layouts.
  • Add Tensor::copyFromHost / Tensor::copyToHost and refactor tests/helpers/samples to use padding-aware copies.
  • Update Python DLPack import to honor provided strides (and convert element-strides to byte-strides), with a contiguous fallback.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/core/tensor.cpp Core implementation updates: alignment-aware stride calc, reshape for non-contiguous tensors, host copy helpers.
include/core/tensor.hpp Public API updates for alignment-aware construction + new copy/contiguity helpers.
include/core/mem_alignment.hpp / src/core/mem_alignment.cpp New alignment configuration type used during tensor allocation.
include/core/utils.hpp New low-level helpers (power-of-two + align-up) used for alignment/stride logic.
src/core/tensor_storage.cpp / include/core/tensor_storage.hpp Expose allocator used by TensorStorage.
src/op_non_max_suppression.cpp Update reshape call sites to new reshape(dtype, shape) signature.
python/src/py_tensor.cpp / python/include/py_tensor.hpp DLPack import correctness fixes + documentation/typing cleanup.
tests/roccv/cpp/src/tests/core/tensor/test_tensor.cpp Add/adjust tensor tests for padding/reshape/copy helpers.
tests/roccv/cpp/include/test_helpers.hpp Refactor host<->device copy helpers to use new tensor copy APIs.
samples/common/utils.hpp Add batch image load/write utilities that respect tensor padding.
samples/*.cpp (multiple) Update CLI parsing + use new image IO helpers (padding-aware).
benchmarks/src/roccv/roccv_bench_helpers.cpp Update benchmark filling logic to account for padded allocation sizes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread samples/common/utils.hpp
Comment thread include/core/tensor.hpp Outdated
Comment thread samples/bilateral_filter.cpp Outdated
Comment thread samples/composite.cpp
Comment thread samples/cropandresize/cpp/main.cpp Outdated
Comment thread samples/common/utils.hpp Outdated
Comment thread samples/common/utils.hpp Outdated
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 77.87234% with 52 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/core/tensor.cpp 76.47% 33 Missing and 15 partials ⚠️
include/core/utils.hpp 78.57% 3 Missing ⚠️
src/core/tensor_storage.cpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #136      +/-   ##
===========================================
- Coverage    74.40%   74.37%   -0.03%     
===========================================
  Files           79       82       +3     
  Lines         3355     3546     +191     
  Branches       738      776      +38     
===========================================
+ Hits          2496     2637     +141     
- Misses         378      414      +36     
- Partials       481      495      +14     
Files with missing lines Coverage Δ
include/core/mem_alignment.hpp 100.00% <100.00%> (ø)
include/core/tensor.hpp 62.50% <ø> (ø)
src/core/mem_alignment.cpp 100.00% <100.00%> (ø)
src/op_non_max_suppression.cpp 41.18% <100.00%> (ø)
src/core/tensor_storage.cpp 81.48% <0.00%> (-3.13%) ⬇️
include/core/utils.hpp 78.57% <78.57%> (ø)
src/core/tensor.cpp 72.53% <76.47%> (+1.23%) ⬆️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread benchmarks/src/roccv/roccv_bench_helpers.cpp Outdated
Comment thread include/core/mem_alignment.hpp Outdated
Comment thread include/core/tensor.hpp Outdated
Comment thread include/core/tensor.hpp Outdated
Comment thread src/core/tensor.cpp
Comment thread src/core/tensor.cpp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci:precheckin enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants