Size-1 dimension can have an arbitrary stride. #5787
Conversation
!test
Description
| Relevant files | |
|---|---|
| Enhancement | |
| Tests | |
PR Reviewer Guide
Here are some key observations to aid the review process:
- 🧪 PR contains tests
- 🔒 No security concerns identified
- ⚡ Recommended focus areas for review
Validation Logic
The update of expected_stride_if_contiguous is also now conditional on size > 1. This means that when size == 1 the expected stride is not updated, which could potentially cause issues in subsequent iterations. The comment suggests this is intentional, but it should be verified that this does not break the stride calculation for later dimensions.
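A minimal trace of that concern, assuming the loop multiplies the expected stride by each dimension's size as it walks from innermost to outermost (the snippet is illustrative, not the actual C++):

```python
# Contiguous tensor of shape [2, 1, 3]: valid strides are [3, <any>, 1].
sizes, strides = [2, 1, 3], [3, 7, 1]

expected_stride_if_contiguous = 1
for size, stride in reversed(list(zip(sizes, strides))):
    if size == 1:
        # Neither the check nor the update runs. Since multiplying the
        # expected stride by a size of 1 would not change it anyway,
        # skipping the update leaves the value (3 here) intact for the
        # outer dimension.
        continue
    assert stride == expected_stride_if_contiguous
    expected_stride_if_contiguous *= size
```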
Greptile Overview
Greptile Summary
Changes Overview
This PR allows size-1 dimensions to have arbitrary stride values by modifying the stride validation logic in validateAllocationSizesAndStrides().
Key Changes:
- Core Logic Update (csrc/tensor_metadata.cpp):
  - Modified the stride validation to skip the check when size == 1. This is mathematically sound since a dimension of size 1 only ever accesses index 0, making its stride irrelevant.
  - When size == 1, expected_stride_if_contiguous is not updated, allowing it to be correctly propagated to outer dimensions.
- Loop Refactoring:
  - Converted the traditional C-style for loop to a modern C++20 range-based loop using arange() with std::views::reverse. Functionally equivalent but more idiomatic.
- Test Coverage (tests/python/direct/test_allocation.py):
  - The new test case test_contiguous() validates the fix with a realistic scenario: it creates a tensor with shape [1, 1] and strides [2, 1] using torch.as_strided(), a valid contiguous tensor that would have failed validation under the old logic (see the sketch after this list).
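A minimal Python sketch of the shape/stride setup described above (the exact test body and the nvFuser-side call are not shown in this PR summary, so treat this as an illustration rather than the test itself):

```python
import torch

# Shape [1, 1] with strides [2, 1], as in the new test case.
base = torch.arange(2.0)
t = base.as_strided((1, 1), (2, 1))

# Only index (0, 0) is ever accessed, so the stride of 2 on the size-1
# outer dimension is irrelevant: PyTorch itself reports this tensor as
# contiguous, yet the old validation logic rejected it.
assert t.is_contiguous()
print(t.shape, t.stride())  # torch.Size([1, 1]) (2, 1)
```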
Why This Change Is Correct:
When a dimension has size 1, its stride value doesn't affect memory access patterns since only the element at index 0 is ever accessed. The old code incorrectly required these dimensions to match the expected contiguous strides. The new code properly recognizes that size-1 dimensions can have any stride while still being part of a valid contiguous layout.
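For concreteness, here is a Python re-expression of the relaxed rule described above; the actual implementation is the C++ validateAllocationSizesAndStrides() in csrc/tensor_metadata.cpp, and this sketch only models fully contiguous dimensions:

```python
def validate_contiguous_strides(sizes, strides):
    """Illustrative re-expression of the relaxed validation rule."""
    expected_stride_if_contiguous = 1
    # Walk dimensions from innermost (rightmost) to outermost.
    for size, stride in reversed(list(zip(sizes, strides))):
        if size == 1:
            # Only index 0 is ever accessed, so any stride is fine and
            # the expected stride propagates unchanged to outer dims.
            continue
        if stride != expected_stride_if_contiguous:
            raise ValueError(
                f"stride {stride} != expected {expected_stride_if_contiguous}"
            )
        expected_stride_if_contiguous *= size


validate_contiguous_strides([1, 1], [2, 1])  # passes now; rejected before
validate_contiguous_strides([4, 3], [3, 1])  # ordinary contiguous case still passes
```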
Confidence Score: 5/5
- This PR is safe to merge with high confidence. The core logic change is mathematically sound and well-motivated.
- Score reflects that the core logical changes are correct and well-reasoned. The stride validation modification properly accounts for the fact that size-1 dimensions' strides don't affect memory access patterns. The loop refactoring to modern C++20 ranges is a direct equivalent. The only minor issue is a copyright year inconsistency in the test file (2026 vs 2025), which doesn't affect functionality. Comprehensive testing is provided with a realistic test case demonstrating the fix works for the intended use case (contiguous tensor with size-1 dimension having arbitrary stride).
- No files require special attention. Minor style improvement recommended: update copyright year in test file from 2026 to 2025 for consistency.
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| csrc/tensor_metadata.cpp | 5/5 | Core logic change to allow size-1 dimensions to have arbitrary strides. The validation now skips stride checks for dimensions where size==1, as their stride values don't affect memory access patterns. Loop refactored to use C++20 ranges. Logic is sound and mathematically correct. |
| tests/python/direct/test_allocation.py | 4/5 | New test case for validating size-1 dimensions with arbitrary strides. Tests realistic scenario of contiguous tensor with size-1 dimension that has non-unit stride. Minor issue: copyright year should be 2025 for consistency with other test files. |
Sequence Diagram
sequenceDiagram
actor User as Test Case
participant Val as validateAllocationSizesAndStrides
participant Check as Stride Validation
User->>Val: Call with sizes=[1,1], strides=[2,1], contiguity=[true,true]
Note over Val: Process dim 1 (rightmost)
Val->>Check: size=1, stride=1
Check->>Check: size > 1? NO → skip stride check
Check->>Check: keep expected_stride=1
Check-->>Val: PASS (no check performed)
Note over Val: Process dim 0
Val->>Check: size=1, stride=2
Check->>Check: size > 1? NO → skip stride check
Check->>Check: keep expected_stride=1
Check-->>Val: PASS (no check performed)
Note over Val: All dimensions processed
Val-->>User: Validation succeeds (would fail with old logic)
jjsjann123
left a comment
I'm fine with relaxing the checks.
Just wanted to point out that previously the decision not to ignore the stride of size-1 dimensions was made intentionally.
Back in the days when convolution was important, we relied on the strides of convolution weights to determine the layout of the convolution. In that case it's often necessary to use the stride of a size-1 dimension to interpret the user's intended stride order for propagation, so naturally we chose to guard on that as well.
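To make that point concrete, here is a hypothetical illustration (the shapes and strides are assumptions, not taken from any convolution code in this repo): for a weight of shape [out_channels, in_channels=1, kH, kW], the stride of the size-1 channel dimension is the only thing distinguishing an NCHW layout from a channels-last one.

```python
import torch

# Weight of shape [8, 1, 3, 3]. Both layouts below describe the same
# 72-element buffer; they differ only in the stride of the size-1 dim.
nchw = torch.empty_strided((8, 1, 3, 3), (9, 9, 3, 1))  # contiguous NCHW
nhwc = torch.empty_strided((8, 1, 3, 3), (9, 1, 3, 1))  # channels-last style

print(nchw.stride())  # (9, 9, 3, 1)
print(nhwc.stride())  # (9, 1, 3, 1)
# Ignoring the size-1 dimension's stride makes these two layouts
# indistinguishable, which is why the old check guarded on it.
```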
  int64_t expected_stride_if_contiguous = 1;
- auto dim_index = static_cast<int64_t>(sizes.size());
+ auto dim_index = std::ssize(sizes);
👏
No description provided.