[Optimization]: Reduce branching when possible in casting.hpp by zacharyvincze · Pull Request #117 · ROCm/rocCV

zacharyvincze · 2026-02-06T20:13:00Z

Details

Removes branching where possible from the casting helper functions in casting.hpp, with the aim of reducing divergence in GPU kernel implementations.
Includes fixes to some float -> integer saturation casts, especially for 32/64-bit integer targets whose limits cannot be represented exactly as 32-bit floats.
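
To illustrate the representability issue mentioned above, here is a minimal host-side sketch, not the code from this PR. `saturate_to_i32` is a hypothetical name, and both the NaN policy (map to 0) and the truncating conversion are assumptions made for the example. The point is that `INT32_MAX` (2^31 - 1) has no exact `float` representation and rounds up to 2^31, so the upper bound is compared against 2^31 directly instead of against `static_cast<float>(INT32_MAX)`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration only; not the rocCV implementation.
inline std::int32_t saturate_to_i32(float x) {
    // 2^31 is exactly representable as a float. INT32_MAX (2^31 - 1) is not:
    // on IEEE-754 targets it rounds up to 2^31 when converted to float.
    constexpr float kTwoPow31 = 2147483648.0f;
    constexpr std::int32_t kMax = std::numeric_limits<std::int32_t>::max();
    constexpr std::int32_t kMin = std::numeric_limits<std::int32_t>::min();

    if (x != x) return 0;             // NaN: policy assumed for this sketch
    if (x >= kTwoPow31) return kMax;  // also covers +inf
    if (x < -kTwoPow31) return kMin;  // also covers -inf; -2^31 itself is in range
    // Every remaining value lies in [-2^31, 2^31), so the conversion is defined.
    return static_cast<std::int32_t>(x);  // truncates toward zero
}
```

Clamping with `min(x, static_cast<float>(INT32_MAX))` instead would let 2^31 through to the final conversion, which is out of range for `int32_t`. The same reasoning applies to 64-bit targets, with 2^63 or 2^64 as the bound.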

Copilot

Pull request overview

This PR updates the core casting helpers to reduce branching (especially for GPU code paths) and adjusts saturation behavior for some float→integer conversions, alongside adding a small test and extending supported type traits.

Changes:

- Refactors ScalarSaturateCast / ScalarRangeCast logic in casting.hpp to use more branchless/min-max based clamping and special-case small integer widths.
- Extends type traits support to include long/ulong vectorized types.
- Adds a new C++ test covering basic SaturateCast behavior and a few limit/vector cases.
- Adjusts the GPU block dimensions for the Composite operator kernel launch.
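
As a rough sketch of what min/max based clamping for small integer widths can look like (hypothetical helper names, host-side C++, not the code in `casting.hpp`): for 8-bit targets both limits are exactly representable in `float`, so the clamp can happen in the source type before a single conversion. Round-to-nearest via `std::nearbyint` is an assumption about the desired rounding.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: float -> uint8 saturation using only min/max.
inline std::uint8_t saturate_to_u8(float x) {
    // 0 and 255 are exact in float, so clamping before converting is safe.
    // std::fmax/std::fmin return the non-NaN operand, so NaN maps to 0 here.
    const float clamped = std::fmin(std::fmax(x, 0.0f), 255.0f);
    // std::nearbyint follows the current rounding mode (round-half-even by default).
    return static_cast<std::uint8_t>(std::nearbyint(clamped));
}

// Hypothetical helper: int32 -> int8 saturation using only min/max.
inline std::int8_t saturate_i32_to_i8(std::int32_t v) {
    const std::int32_t clamped =
        std::min(std::max(v, std::int32_t{-128}), std::int32_t{127});
    return static_cast<std::int8_t>(clamped);
}
```

Compilers typically lower these min/max calls to select or min/max instructions instead of jumps, although that is not guaranteed by the language.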

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `include/core/detail/casting.hpp` | Refactors saturate/range cast implementations to reduce branching and adjust clamping/rounding logic. |
| `include/core/detail/type_traits.hpp` | Adds `long` / `ulong` to the type-traits macro set. |
| `tests/roccv/cpp/src/tests/core/detail/test_saturate_cast.cpp` | Introduces a basic unit test for `SaturateCast`, including a couple of vectorized casts. |
| `src/op_composite.cpp` | Changes GPU kernel launch block dimensions for the composite operator. |


codecov-commenter · 2026-02-06T20:45:49Z

Codecov Report

❌ Patch coverage is 54.54545% with 15 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| include/core/detail/casting.hpp | 54.55% | 12 Missing and 3 partials ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #117      +/-   ##
===========================================
- Coverage    74.40%   74.38%   -0.02%     
===========================================
  Files           79       79              
  Lines         3355     3368      +13     
  Branches       738      733       -5     
===========================================
+ Hits          2496     2505       +9     
- Misses         378      379       +1     
- Partials       481      484       +3

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| include/core/detail/type_traits.hpp | `87.50% <ø> (ø)` | |
| include/core/detail/casting.hpp | `78.26% <54.55%> (+2.31%)` | ⬆️ |

... and 1 file with indirect coverage changes


simonCatBot · 2026-03-20T21:25:16Z

Review: [Optimization] Reduce branching in casting.hpp

Kernel optimization focusing on GPU divergence:

Changes:

- Branch reduction in casting helper functions
- Fixes for float->integer saturation casts (32/64-bit cases)
- 4 files changed, +129/-33 lines

Assessment: Needs Review - Performance optimization.

Reducing divergent branching in GPU kernels generally helps warp (wavefront) efficiency. The fixes for 32/64-bit integer saturation casts sound important: precision issues in type conversion can cause subtle bugs.

Would benefit from:

- Performance benchmarks showing divergence reduction
- Verification that precision is maintained for edge cases
- Review of the saturation logic changes
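
One possible way to approach the edge-case verification requested above is a property check in place of a handful of hand-picked values: sweep sampled `float` bit patterns and confirm that the cast never decreases as the input grows. The sketch below is an assumption about how such a check could look. `cast_under_test` is a hypothetical stand-in and would be replaced by the real helper.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Stand-in for the cast under test; hypothetical, not rocCV's SaturateCast.
inline std::int32_t cast_under_test(float x) {
    constexpr float kTwoPow31 = 2147483648.0f;  // 2^31, exactly representable
    if (std::isnan(x)) return 0;                // NaN policy assumed for this sketch
    if (x >= kTwoPow31) return std::numeric_limits<std::int32_t>::max();
    if (x < -kTwoPow31) return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(x);        // truncates toward zero
}

// Reinterprets a 32-bit pattern as a float without violating aliasing rules.
inline float float_from_bits(std::uint32_t bits) {
    float x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}

// Walks magnitudes from 0 towards +inf (bit pattern 0x7F800000) in steps of
// `stride` bit patterns. Non-negative floats order the same way as their bit
// patterns, so results for +x must never decrease and results for -x must
// never increase. Returns false for a zero stride or on the first violation.
inline bool is_monotonic_on_sampled_floats(std::uint32_t stride) {
    if (stride == 0) return false;
    std::int32_t prev_pos = 0;
    std::int32_t prev_neg = 0;
    for (std::uint64_t bits = 0; bits <= 0x7F800000u; bits += stride) {
        const float x = float_from_bits(static_cast<std::uint32_t>(bits));
        const std::int32_t pos = cast_under_test(x);
        const std::int32_t neg = cast_under_test(-x);
        if (pos < prev_pos || neg > prev_neg) return false;
        prev_pos = pos;
        prev_neg = neg;
    }
    return true;
}
```

A stride of 1 visits every non-negative finite `float` and takes noticeably longer. A larger prime stride gives a quick smoke test that still crosses the 2^31 boundary region.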

Solid optimization PR.


zacharyvincze · 2026-04-28T01:57:19Z

Fixed some issues caused by assuming that certain types were floats. The code now uses the double-precision version of each function when computing with doubles, to maintain precision.

Added some more tests as well to catch more edge cases for Range/Saturate/Static casting.
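
The kind of precision loss described above can be shown with a small host-side sketch. The names `round_nearest` and `round_via_float` are hypothetical and this is not the helper added in the PR. It only illustrates why a rounding helper should pick the function variant that matches the operand type: `rintf` operates on `float`, so passing a `double` through it narrows the value first.

```cpp
#include <math.h>
#include <type_traits>

// Hypothetical unified rounding helper: selects the variant matching T at
// compile time, so a double is never narrowed to float before rounding.
template <typename T>
inline T round_nearest(T x) {
    static_assert(std::is_floating_point<T>::value, "floating-point types only");
    if constexpr (std::is_same<T, float>::value) {
        return ::rintf(x);
    } else {
        return ::rint(x);
    }
}

// The mistake being avoided: rounding a double with the float variant.
inline double round_via_float(double x) {
    return static_cast<double>(::rintf(static_cast<float>(x)));
}
```

For example, 16777217.4 lies just above 2^24, where adjacent `float` values are 2 apart. With the default rounding mode, `round_nearest(16777217.4)` yields 16777217.0, while `round_via_float(16777217.4)` yields 16777218.0 because the input is already rounded to the nearest `float` before `rintf` runs.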

zacharyvincze added 3 commits January 30, 2026 10:31

Avoid branching in casting implementations

4232bcd

Add more tests for Saturate cast

77cabc7

Fix issues with float -> integer saturate casts

d887102

zacharyvincze requested review from Copilot, jeffqjiangNew and paveltc February 6, 2026 20:13

zacharyvincze self-assigned this Feb 6, 2026

zacharyvincze added the enhancement and ci:precheckin labels Feb 6, 2026

Copilot started reviewing on behalf of zacharyvincze February 6, 2026 20:13

Copilot AI reviewed Feb 6, 2026


Comment thread include/core/detail/casting.hpp

Comment thread include/core/detail/casting.hpp Outdated

Comment thread include/core/detail/casting.hpp Outdated

Comment thread tests/roccv/cpp/src/tests/core/detail/test_saturate_cast.cpp

Comment thread src/op_composite.cpp

Undo changes to composite

146a1f9

zacharyvincze added 11 commits February 6, 2026 16:18

Review fixes

e9e9f0b

Add another test case for RangeCast

13a78be

Merge branch 'develop' into zv/optimization/optimize-casting-performance

f1b1571

Merge branch 'develop' into zv/optimization/optimize-casting-performance

146c95f

Merge branch 'develop' into zv/optimization/optimize-casting-performance

1cbedda

Merge branch 'develop' into zv/optimization/optimize-casting-performance

f97231c

Merge branch 'develop' into zv/optimization/optimize-casting-performance

e712805

Merge branch 'develop' into zv/optimization/optimize-casting-performance

9ae56a5

Merge branch 'develop' into zv/optimization/optimize-casting-performance

12d355b

Merge branch 'develop' into zv/optimization/optimize-casting-performance

a883238

Merge branch 'develop' into zv/optimization/optimize-casting-performance

2fcaf2f

zacharyvincze added 5 commits April 7, 2026 14:02

Merge branch 'develop' into zv/optimization/optimize-casting-performance

a88cf40

Merge branch 'develop' into zv/optimization/optimize-casting-performance

6be0319

Merge branch 'develop' into zv/optimization/optimize-casting-performance

70a9ef5

Merge branch 'develop' into zv/optimization/optimize-casting-performance

0916ad2

Merge branch 'develop' into zv/optimization/optimize-casting-performance

aa74059

zacharyvincze added 6 commits April 23, 2026 14:49

Merge branch 'develop' into zv/optimization/optimize-casting-performance

ae498e7

Merge branch 'develop' into zv/optimization/optimize-casting-performance

ab56bf8

Fix double precision issue in scalar range casting + add unified rounding helper

883a1f0

Improve vector saturate/range/static cast constexpr branching

bbeefcf

Add additional tests for casting helpers + introduce static cast tests

0b10ce3

Use fmed3f for floating point clamping

3d54d7d

Merge branch 'develop' into zv/optimization/optimize-casting-performance

d6fabe0

zacharyvincze wants to merge 27 commits into ROCm:develop from zacharyvincze:zv/optimization/optimize-casting-performance

4 participants

Conversation

zacharyvincze commented Feb 6, 2026

Details

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

simonCatBot commented Mar 20, 2026

Review: [Optimization] Reduce branching in casting.hpp

Uh oh!

zacharyvincze commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

codecov-commenter commented Feb 6, 2026 •

edited

Loading