[Feature] Base/Row alignment support for Tensors#136
Open
zacharyvincze wants to merge 56 commits intoROCm:developfrom
Open
[Feature] Base/Row alignment support for Tensors#136zacharyvincze wants to merge 56 commits intoROCm:developfrom
zacharyvincze wants to merge 56 commits intoROCm:developfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds base-pointer and per-row alignment (padding) support to roccv::Tensor, updating stride/reshape semantics and introducing host<->device copy helpers that are padding-aware. It also updates samples, Python bindings, benchmarks, and tests to operate correctly with non-contiguous (row-padded) tensors.
Changes:
- Introduce
MemAlignmentand updateTensorallocation/stride/reshape logic to support row/base alignment and non-contiguous layouts. - Add
Tensor::copyFromHost/Tensor::copyToHostand refactor tests/helpers/samples to use padding-aware copies. - Update Python DLPack import to honor provided strides (and convert element-strides to byte-strides), with a contiguous fallback.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
src/core/tensor.cpp |
Core implementation updates: alignment-aware stride calc, reshape for non-contiguous tensors, host copy helpers. |
include/core/tensor.hpp |
Public API updates for alignment-aware construction + new copy/contiguity helpers. |
include/core/mem_alignment.hpp / src/core/mem_alignment.cpp |
New alignment configuration type used during tensor allocation. |
include/core/utils.hpp |
New low-level helpers (power-of-two + align-up) used for alignment/stride logic. |
src/core/tensor_storage.cpp / include/core/tensor_storage.hpp |
Expose allocator used by TensorStorage. |
src/op_non_max_suppression.cpp |
Update reshape call sites to new reshape(dtype, shape) signature. |
python/src/py_tensor.cpp / python/include/py_tensor.hpp |
DLPack import correctness fixes + documentation/typing cleanup. |
tests/roccv/cpp/src/tests/core/tensor/test_tensor.cpp |
Add/adjust tensor tests for padding/reshape/copy helpers. |
tests/roccv/cpp/include/test_helpers.hpp |
Refactor host<->device copy helpers to use new tensor copy APIs. |
samples/common/utils.hpp |
Add batch image load/write utilities that respect tensor padding. |
samples/*.cpp (multiple) |
Update CLI parsing + use new image IO helpers (padding-aware). |
benchmarks/src/roccv/roccv_bench_helpers.cpp |
Update benchmark filling logic to account for padded allocation sizes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #136 +/- ##
===========================================
- Coverage 74.40% 74.37% -0.03%
===========================================
Files 79 82 +3
Lines 3355 3546 +191
Branches 738 776 +38
===========================================
+ Hits 2496 2637 +141
- Misses 378 414 +36
- Partials 481 495 +14
🚀 New features to boost your workflow:
|
…cze/rocCV into zv/feature/add-memalign-class
jeffqjiangNew
approved these changes
Apr 10, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
PR Description
General
roccv::Tensor.MemAlignclass which is used as a parameter when allocating memory for Tensors. This specifies what the row alignment and base alignment should be for newly created tensors.Samples
Tensors
copyToHostandcopyFromHostmethods to the tensor, which accepts and produces contiguous host buffers. Since padding makes copying memory from the host to the device non-trivial, this allows the user to perform a copy which is agnostic to the underlying row padding that a tensor may have.Python bindings (rocpycv)
Tests
Tensor::copyFromHost,Tensor::copyToHostdirectly instead of using duplicate code.roccv::Tensorare covered. Special attention is paid to the reshape method, since this has undergone the most changes.