merge fixes for sret AS addrspacecast issue into rocm-7.2.0.1 #2490

Open
VigneshwarJ wants to merge 4 commits into release/rocm-rel-7.2.0.1 from amd/dev/vjayakum/rocm-rel-7.2.0.1/sret-as-fix

@VigneshwarJ

No description provided.

…#183639)

When a HIP kernel uses placement new with a function returning an
aggregate via sret (e.g. `new (out) T(make_t())`), and the placement
destination is in global memory (addrspace 1), the sret pointer was
addrspacecast'd to addrspace 5 (private), producing an invalid pointer
that faults at runtime.

Instead of casting the caller's pointer directly, materialise a
temporary alloca in the callee's expected address space, pass that as
the sret argument, and copy the result back to the original destination
after the call.
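
As a rough host-side illustration of the paragraph above (plain C++, no address spaces involved), the sketch below shows the source pattern next to a hand-written model of the fixed lowering: construct into a temporary, then copy the result to the destination. The names `T`, `make_t`, `construct_direct`, `construct_via_temp` and `run_demo` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical aggregate, large enough that common ABIs return it through a
// hidden sret pointer rather than in registers.
struct T {
    int data[16];
};

// Hypothetical factory; the callee writes its result through the sret pointer.
T make_t() {
    T t{};
    for (int i = 0; i < 16; ++i) t.data[i] = i * i;
    return t;
}

// The source pattern from the commit message: placement new initialised from
// a call that returns an aggregate.
T *construct_direct(void *out) {
    return new (out) T(make_t());
}

// Hand-written model of the fixed lowering: the call writes into a local
// temporary (standing in for the alloca), and the result is then copied to
// the real destination.
T *construct_via_temp(void *out) {
    T tmp = make_t();
    return new (out) T(tmp);
}

// Both routes must leave the same bytes at the destination.
void run_demo() {
    alignas(T) unsigned char a[sizeof(T)];
    alignas(T) unsigned char b[sizeof(T)];
    T *p = construct_direct(a);
    T *q = construct_via_temp(b);
    assert(std::memcmp(p, q, sizeof(T)) == 0);
}
```

Calling `run_demo()` from a `main` function exercises both routes. On a host target the two are indistinguishable; the difference only matters when the destination and the callee's expected sret pointer live in different address spaces.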

(cherry picked from commit e2f7f83)

llvm#185091)

…et types

Fix for the buildbot crash on llvm#183639.

The UseTemp path in AggExprEmitter::withReturnValueSlot copies the result
back via EmitAggregateCopy, which asserts that the type has a trivial
copy/move constructor or assignment operator. Gate the DestASMismatch
condition on isTriviallyCopyableType so that non-trivially-copyable types
(e.g. std::exception_ptr) fall through to the addrspacecast path instead.
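
The gating can be sketched as a small decision function. This is a toy model only: `SRetStrategy`, `choose_strategy`, `Pod` and `Handle` are hypothetical names, `std::is_trivially_copyable` stands in for Clang's isTriviallyCopyableType, and the `Direct` case for matching address spaces is an assumption about the surrounding logic.

```cpp
#include <cassert>
#include <type_traits>

// Trivially copyable aggregate: safe to copy back bytewise.
struct Pod {
    int x[8];
};

// Hypothetical type with a user-provided copy constructor, so it is not
// trivially copyable (std::exception_ptr is the example in the commit message).
struct Handle {
    int *p = nullptr;
    Handle() = default;
    Handle(const Handle &o) : p(o.p) {}
};

static_assert(std::is_trivially_copyable<Pod>::value, "Pod is trivially copyable");
static_assert(!std::is_trivially_copyable<Handle>::value, "Handle is not");

enum class SRetStrategy {
    Direct,           // assumed: address spaces already match, pass the pointer
    TempAndCopyBack,  // mismatch and the type can be copied back trivially
    AddrSpaceCast     // mismatch but a copy-back would be invalid
};

// Toy model of the gated condition: take the temporary path only when the
// address spaces mismatch AND the type is trivially copyable.
template <typename Ty>
constexpr SRetStrategy choose_strategy(bool destASMismatch) {
    return !destASMismatch
               ? SRetStrategy::Direct
               : (std::is_trivially_copyable<Ty>::value
                      ? SRetStrategy::TempAndCopyBack
                      : SRetStrategy::AddrSpaceCast);
}
```

With this model, `choose_strategy<Handle>(true)` selects the addrspacecast path, which corresponds to the fall-through the commit describes.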

Fix buildbot crash:
https://lab.llvm.org/buildbot/#/builders/73/builds/19803

(cherry picked from commit 337fed3)

classifyReturnType used getAllocaAddrSpace() for sret, which is wrong
on targets like AMDGPU where alloca lives in addrspace(5). For types
with deleted copy/move constructors, there is no way to construct into
a temp and copy out — the sret pointer must point directly to the caller's
destination in the default address space.
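
To make the constraint concrete, here is a minimal host-side sketch, assuming C++17 (guaranteed copy elision). The type `Pinned` and the helper functions are hypothetical. Because every copy and move operation is deleted, the object can only be constructed directly at the destination, which the constructor records by saving `this`.

```cpp
#include <cassert>
#include <new>

// Hypothetical type with all copy/move operations deleted. It records the
// address at which its constructor actually ran.
struct Pinned {
    const Pinned *self;
    int value;
    explicit Pinned(int v) : self(this), value(v) {}
    Pinned(const Pinned &) = delete;
    Pinned(Pinned &&) = delete;
    Pinned &operator=(const Pinned &) = delete;
    Pinned &operator=(Pinned &&) = delete;
};

// Returns a prvalue; under C++17 the result object is initialised directly
// in the caller-provided storage, with no intermediate temporary.
Pinned make_pinned(int v) {
    return Pinned(v);
}

// Same shape as the pattern in this PR: placement new from a by-value call.
Pinned *construct_at_dest(void *out, int v) {
    return new (out) Pinned(make_pinned(v));
}

// True when the constructor ran at the destination itself.
bool constructed_in_place(void *out, int v) {
    Pinned *p = construct_at_dest(out, v);
    return p->self == p && p->value == v;
}
```

A "construct into a temp and copy out" lowering has no valid way to produce this object, since there is no copy or move operation to call for the copy-out step.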

Add a target hook getSRetAddrSpace() so AMDGPU can return LangAS::Default
for non-register-passable types.
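
The shape of such a hook can be sketched with a toy class hierarchy. This is not Clang's API: `ToyTarget`, `ToyAMDGPU`, `AddrSpace`, and the `registerPassable` parameter are all hypothetical, and the behaviour for register-passable types (falling back to the alloca address space) is an assumption.

```cpp
#include <cassert>

// Toy address-space tags; 5 is the private (alloca) space on AMDGPU.
enum class AddrSpace { Default = 0, Private = 5 };

// Toy stand-in for a target description.
struct ToyTarget {
    virtual ~ToyTarget() = default;
    virtual AddrSpace allocaAddrSpace() const { return AddrSpace::Default; }
    // Toy counterpart of the new hook. Assumed default: sret follows the
    // alloca address space, matching the behaviour before the change.
    virtual AddrSpace sretAddrSpace(bool registerPassable) const {
        (void)registerPassable;
        return allocaAddrSpace();
    }
};

// Toy AMDGPU-like target: allocas live in the private address space, but
// sret for non-register-passable types uses the default address space.
struct ToyAMDGPU : ToyTarget {
    AddrSpace allocaAddrSpace() const override { return AddrSpace::Private; }
    AddrSpace sretAddrSpace(bool registerPassable) const override {
        return registerPassable ? allocaAddrSpace() : AddrSpace::Default;
    }
};
```

Keeping the decision behind a virtual hook leaves targets whose alloca and default address spaces coincide unaffected.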

Fixes issue llvm#185744

(cherry picked from commit de82b47)

… (llvm#193850)

After llvm#186275, the sret address space can differ from the alloca address
space (e.g., AS 0 vs AS 5 on AMDGPU). In CGCall.cpp EmitCall(), when a
discarded-value sret temporary is created, SRetPtr is allocated in the
alloca AS and a lifetime.start is emitted. The pointer is then
addrspacecast'd to match the sret AS, but the CallLifetimeEnd cleanup
was using the addrspacecast'd pointer, triggering an assertion in
EmitLifetimeEnd ("Pointer should be in alloca address space").

Save the original alloca pointer before the addrspacecast and use it
for the lifetime-end cleanup.
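
The ordering issue can be modelled with tagged pointers. This is a toy sketch, not the CGCall.cpp code: `ToyPtr`, `addrSpaceCast`, `emitLifetimeEnd` and `lifetimeEndSucceeds` are hypothetical stand-ins, and the assertion is modelled as an exception so that both outcomes can be observed.

```cpp
#include <cassert>
#include <stdexcept>

constexpr unsigned kAllocaAS = 5;  // private address space on AMDGPU
constexpr unsigned kSRetAS = 0;    // sret address space after the hook change

// Toy pointer: an identity plus the address space it is typed in.
struct ToyPtr {
    int id;
    unsigned addrSpace;
};

// Toy addrspacecast: same object, different address-space type.
ToyPtr addrSpaceCast(ToyPtr p, unsigned as) {
    return ToyPtr{p.id, as};
}

// Toy lifetime-end emission, mirroring the invariant quoted above.
void emitLifetimeEnd(ToyPtr p) {
    if (p.addrSpace != kAllocaAS)
        throw std::logic_error("Pointer should be in alloca address space");
}

// Models the discarded-value sret temporary. When useOriginalAlloca is true
// the cleanup uses the pointer saved before the cast (the fixed behaviour).
bool lifetimeEndSucceeds(bool useOriginalAlloca) {
    ToyPtr sretPtr{1, kAllocaAS};       // temporary allocated in the alloca AS
    ToyPtr originalAlloca = sretPtr;    // saved before the addrspacecast
    sretPtr = addrSpaceCast(sretPtr, kSRetAS);
    try {
        emitLifetimeEnd(useOriginalAlloca ? originalAlloca : sretPtr);
    } catch (const std::logic_error &) {
        return false;
    }
    return true;
}
```

In this model `lifetimeEndSucceeds(false)` reproduces the failure and `lifetimeEndSucceeds(true)` corresponds to the fix.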

Fixes buildbot failure: hip-third-party-libs-tests

(cherry picked from commit 528e673)

VigneshwarJ requested review from bcahoon and ronlieb on May 11, 2026 20:57
ronlieb requested a review from arjun-raj-kuppala on May 12, 2026 02:32