Skip to content

[Comgr][hotswap] Add bare-bones hotswap transpiler scaffolding#2438

Open
tgymnich wants to merge 1 commit into
amd-stagingfrom
users/tgymnich/hotswap-pr-01-scaffolding
Open

[Comgr][hotswap] Add bare-bones hotswap transpiler scaffolding#2438
tgymnich wants to merge 1 commit into
amd-stagingfrom
users/tgymnich/hotswap-pr-01-scaffolding

Conversation

@tgymnich
Copy link
Copy Markdown

@tgymnich tgymnich commented May 7, 2026

Lands the minimal scaffolding
needed to build a hotswap-transpiler static library:

  • raiser.{hpp,cpp}raiseToIR() entry point. The implementation creates an llvm::Module with a single ret void AMDGPU kernel function for any input. ELF ingestion, MC disassembly and the per-format raise handlers land in subsequent patches.
  • raise_failure.{hpp,cpp} — structured failure values consumed by RaiseResult::failure.
  • code_object_utils.hppKernelMeta struct used in the raiseToIR signature.
  • Minimal CMakeLists.txt linking only LLVMCore, LLVMSupport, LLVMTargetParser. The full library/include surface grows incrementally as later patches need it.
  • tests/raiser_empty_module_test.cpp (gtest) verifies the scaffolding contract: an empty input produces a well-formed module with one AMDGPU_KERNEL function whose body is ret void.

@tgymnich
Copy link
Copy Markdown
Author

tgymnich commented May 7, 2026

@z1-cciauto
Copy link
Copy Markdown
Collaborator

@chinmaydd chinmaydd added comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature labels May 7, 2026
@chinmaydd
Copy link
Copy Markdown

@tgymnich per today's call, would you be okay with landing the refactor of moving the transpiler into hotswap/transpiler first or do we want to do that as a followup ?

@chinmaydd
Copy link
Copy Markdown

Disclaimer: I've gathered most of these through Claude Opus 4.7, but I believe they are retrieved through the agent looking at the AGENT_CONVENTIONS.md file and the current conventions that we follow in Comgr.


Scaffolding isn't wired into the comgr build. amd/comgr/CMakeLists.txt is unchanged, so neither hotswap-transpiler nor transpiler_tests is built under make check-comgr. HOTSWAP_TRANSPILER_BUILD_TOOLS also defaults OFF in the embedded case, so the gtest stays unbuilt even after a future add_subdirectory(). The contract this patch advertises is currently unverifiable in CI.

Duplicates amd/comgr/test-unit/ infrastructure. That tree already has 3-tier gtest discovery (llvm_gtest → find_package(GTest CONFIG) → LLVM-installed gtest), comgr_match_llvm_rtti(), and check-comgr plumbing. The new find_package(GTest REQUIRED) is strictly the 2nd tier and hard-fails in monorepo builds without a system gtest. Per AGENT_CONVENTIONS.md §1, I'd prefer the transpiler refactor land first and this slot into the existing pattern.

License headers missing or incomplete. code_object_utils.hpp, raise_failure.{hpp,cpp}, raiser.hpp, tests/test_main.cpp have no header at all; raiser.cpp and raiser_empty_module_test.cpp have the banner but lack the Apache / SPDX block. Compare test-unit/HotswapElfTest.cpp.

Conventions diverge from the rest of amd/comgr/. .clang-tidy mandates CamelCase for members / params / locals — the new code is lowerCamelCase throughout (kernargSegmentSize, kernelName, funcTy). Headers are .hpp (rest of comgr is .h); namespace is transpiler (rest is COMGR::hotswap). Cheaper to fix now than after #2437 / #2439 stack on top.

Test doesn't verify the advertised contract, and raiseToIR masks failures. Banner says "body is ret void"; the test only checks !fn->empty() and the calling convention — not the terminator, the target triple, the (missing) DataLayout, or any failure path. raiseToIR itself //s every parameter except kernelName, never calls parseTargetIdentifier(sourceISA) (AGENT_CONVENTIONS.md §1 reuse list), doesn't reject empty inputs, and unconditionally sets success = true. Both downstream PRs will inherit this.

Smaller stuff (dead fields on RaiseResult, single-valued RaiseFailureReason plus its unreachable fallback in reasonString, the auto / rfind(...,0)==0 / hand-rolled (x+7)&~7 / signed int sizes in KernelMeta::implicitArgsBase())

@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-01-scaffolding branch from 2675e92 to 3831ee0 Compare May 8, 2026 10:17
@z1-cciauto
Copy link
Copy Markdown
Collaborator

Copy link
Copy Markdown

@martin-luecke martin-luecke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks much better with the recent changes. The scaffold is small, the COMGR test-unit integration is the right direction, and the tests now cover the advertised contract much better.

I left a few comments. Once those are addressed and CI is green, I’m fine with merging this PR.

  • Not approving because I am partly author of this.

Comment thread amd/comgr/hotswap/raiser.cpp Outdated
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

# --- LLVM ---------------------------------------------------------------------
if(NOT TARGET LLVMSupport)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI is currently failing because this file skips include(AddLLVM) when LLVMSupport exists, but llvm_update_compile_flags is still not defined in the COMGR/TheRock configure path. Could we instead check if(NOT COMMAND llvm_update_compile_flags) before including AddLLVM?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread amd/comgr/CMakeLists.txt Outdated
# Always added so the static library is available to the test-unit suite;
# linking into amd_comgr happens in a later patch via the
# COMGR_ENABLE_HOTSWAP_TRANSPILE option.
add_subdirectory(hotswap)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the hotswap subproject configure unconditionally in normal COMGR builds, before any public API lands. Is that intentional for test-unit coverage? If yes, the hotswap CMake needs to be robust in all COMGR/TheRock configurations. If not, consider gating or EXCLUDE_FROM_ALL until integration lands.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is intentional. Better figure this out as soon as possible, and keep CI green.

Comment thread amd/comgr/hotswap/code_object_utils.hpp Outdated
@lamb-j
Copy link
Copy Markdown
Collaborator

lamb-j commented May 8, 2026

Are we wanting to keep comgr/hotswap or move to comgr/src/hotswap? Having a new top-level directory with source code would be a departure from the existing setup. (Is hotswap a Comgr appendage, or an integrated feature)

@chinmaydd
Copy link
Copy Markdown

I'm okay with either. TBH since we provide an API through Comgr for the hotswap rewrite, I think keeping it under comgr/src/hotswap makes sense to me.

If this was exclusively a Comgr API consumer, I would do comgr/hotswap.

@tgymnich tgymnich marked this pull request as draft May 8, 2026 21:26
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-01-scaffolding branch 2 times, most recently from 9f6c2ed to ec69509 Compare May 11, 2026 12:32
@tgymnich tgymnich marked this pull request as ready for review May 11, 2026 12:34
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-01-scaffolding branch from ec69509 to 3b59a99 Compare May 11, 2026 12:42
@tgymnich
Copy link
Copy Markdown
Author

I re-worked this with the clang-tidy rules and the new home in comgr/src/hotswap

Comment on lines +30 to +33
// Ideally we would reuse `COMGR::parseTargetIdentifier`, but that helper
// currently lives behind the comgr-metadata layer in `src/comgr.cpp` and
// is not reachable from the hotswap subproject. As a stop-gap, validate
// the AMDGPU processor name through `llvm::AMDGPU::parseArchAMDGCN`.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM.

However, what can we do to resolve this ? We want to reuse as much as possible from Comgr / LLVM and lay a foundation for doing that from the get-go.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm at it for PR2, this may be backported to this PR

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

Copy link
Copy Markdown

@chinmaydd chinmaydd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-01-scaffolding branch from 3b59a99 to 91c4a7e Compare May 11, 2026 20:21
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-01-scaffolding branch from 91c4a7e to cceb980 Compare May 11, 2026 20:48
Lands the minimal scaffolding
needed to build a `hotswap-transpiler` static library:

  * `raiser.{hpp,cpp}` — `raiseToIR()` entry point. The implementation
    creates an `llvm::Module` with a single `ret void` AMDGPU kernel
    function for any input. ELF ingestion, MC disassembly and the
    per-format raise handlers land in subsequent patches.
  * `raise_failure.{hpp,cpp}` — structured failure values consumed by
    `RaiseResult::failure`.
  * `code_object_utils.h` — `KernelMeta` struct used in the
    `raiseToIR` signature.
  * Minimal `CMakeLists.txt` linking only `LLVMCore`, `LLVMSupport`,
    `LLVMTargetParser`. The full library/include surface grows
    incrementally as later patches need it.
  * `tests/raiser_empty_module_test.cpp` (gtest) verifies the
    scaffolding contract: an empty input produces a well-formed
    module with one `AMDGPU_KERNEL` function whose body is `ret void`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tgymnich tgymnich force-pushed the users/tgymnich/hotswap-pr-01-scaffolding branch from cceb980 to 7b8ccf2 Compare May 11, 2026 21:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comgr Related to Code Object Manager hotswap Related to the Comgr Hotswap feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants