[Comgr][hotswap] Add bare-bones hotswap transpiler scaffolding#2438
[Comgr][hotswap] Add bare-bones hotswap transpiler scaffolding#2438tgymnich wants to merge 1 commit into
Conversation
|
@tgymnich per today's call, would you be okay with landing the refactor of moving the transpiler into |
|
Disclaimer: I've gathered most of these through Claude Opus 4.7, but I believe they are retrieved through the agent looking at the AGENT_CONVENTIONS.md file and the current conventions that we follow in Comgr. Scaffolding isn't wired into the comgr build. amd/comgr/CMakeLists.txt is unchanged, so neither hotswap-transpiler nor transpiler_tests is built under make check-comgr. HOTSWAP_TRANSPILER_BUILD_TOOLS also defaults OFF in the embedded case, so the gtest stays unbuilt even after a future add_subdirectory(). The contract this patch advertises is currently unverifiable in CI. Duplicates amd/comgr/test-unit/ infrastructure. That tree already has 3-tier gtest discovery (llvm_gtest → find_package(GTest CONFIG) → LLVM-installed gtest), comgr_match_llvm_rtti(), and check-comgr plumbing. The new find_package(GTest REQUIRED) is strictly the 2nd tier and hard-fails in monorepo builds without a system gtest. Per AGENT_CONVENTIONS.md §1, I'd prefer the transpiler refactor land first and this slot into the existing pattern. License headers missing or incomplete. code_object_utils.hpp, raise_failure.{hpp,cpp}, raiser.hpp, tests/test_main.cpp have no header at all; raiser.cpp and raiser_empty_module_test.cpp have the banner but lack the Apache / SPDX block. Compare test-unit/HotswapElfTest.cpp. Conventions diverge from the rest of amd/comgr/. .clang-tidy mandates CamelCase for members / params / locals — the new code is lowerCamelCase throughout (kernargSegmentSize, kernelName, funcTy). Headers are .hpp (rest of comgr is .h); namespace is transpiler (rest is COMGR::hotswap). Cheaper to fix now than after #2437 / #2439 stack on top. Test doesn't verify the advertised contract, and raiseToIR masks failures. Banner says "body is ret void"; the test only checks !fn->empty() and the calling convention — not the terminator, the target triple, the (missing) DataLayout, or any failure path. raiseToIR itself /…/s every parameter except kernelName, never calls parseTargetIdentifier(sourceISA) (AGENT_CONVENTIONS.md §1 reuse list), doesn't reject empty inputs, and unconditionally sets success = true. Both downstream PRs will inherit this. Smaller stuff (dead fields on RaiseResult, single-valued RaiseFailureReason plus its unreachable fallback in reasonString, the auto / rfind(...,0)==0 / hand-rolled (x+7)&~7 / signed int sizes in KernelMeta::implicitArgsBase()) |
2675e92 to
3831ee0
Compare
martin-luecke
left a comment
There was a problem hiding this comment.
Thanks, this looks much better with the recent changes. The scaffold is small, the COMGR test-unit integration is the right direction, and the tests now cover the advertised contract much better.
I left a few comments. Once those are addressed and CI is green, I’m fine with merging this PR.
- Not approving because I am partly author of this.
| set(CMAKE_EXPORT_COMPILE_COMMANDS ON) | ||
|
|
||
| # --- LLVM --------------------------------------------------------------------- | ||
| if(NOT TARGET LLVMSupport) |
There was a problem hiding this comment.
CI is currently failing because this file skips include(AddLLVM) when LLVMSupport exists, but llvm_update_compile_flags is still not defined in the COMGR/TheRock configure path. Could we instead check if(NOT COMMAND llvm_update_compile_flags) before including AddLLVM?
| # Always added so the static library is available to the test-unit suite; | ||
| # linking into amd_comgr happens in a later patch via the | ||
| # COMGR_ENABLE_HOTSWAP_TRANSPILE option. | ||
| add_subdirectory(hotswap) |
There was a problem hiding this comment.
This makes the hotswap subproject configure unconditionally in normal COMGR builds, before any public API lands. Is that intentional for test-unit coverage? If yes, the hotswap CMake needs to be robust in all COMGR/TheRock configurations. If not, consider gating or EXCLUDE_FROM_ALL until integration lands.
There was a problem hiding this comment.
Yes, this is intentional. Better figure this out as soon as possible, and keep CI green.
|
Are we wanting to keep comgr/hotswap or move to comgr/src/hotswap? Having a new top-level directory with source code would be a departure from the existing setup. (Is hotswap a Comgr appendage, or an integrated feature) |
|
I'm okay with either. TBH since we provide an API through Comgr for the hotswap rewrite, I think keeping it under If this was exclusively a Comgr API consumer, I would do |
9f6c2ed to
ec69509
Compare
ec69509 to
3b59a99
Compare
|
I re-worked this with the clang-tidy rules and the new home in |
| // Ideally we would reuse `COMGR::parseTargetIdentifier`, but that helper | ||
| // currently lives behind the comgr-metadata layer in `src/comgr.cpp` and | ||
| // is not reachable from the hotswap subproject. As a stop-gap, validate | ||
| // the AMDGPU processor name through `llvm::AMDGPU::parseArchAMDGCN`. |
There was a problem hiding this comment.
Mostly LGTM.
However, what can we do to resolve this ? We want to reuse as much as possible from Comgr / LLVM and lay a foundation for doing that from the get-go.
There was a problem hiding this comment.
I'm at it for PR2, this may be backported to this PR
3b59a99 to
91c4a7e
Compare
91c4a7e to
cceb980
Compare
Lands the minimal scaffolding
needed to build a `hotswap-transpiler` static library:
* `raiser.{hpp,cpp}` — `raiseToIR()` entry point. The implementation
creates an `llvm::Module` with a single `ret void` AMDGPU kernel
function for any input. ELF ingestion, MC disassembly and the
per-format raise handlers land in subsequent patches.
* `raise_failure.{hpp,cpp}` — structured failure values consumed by
`RaiseResult::failure`.
* `code_object_utils.h` — `KernelMeta` struct used in the
`raiseToIR` signature.
* Minimal `CMakeLists.txt` linking only `LLVMCore`, `LLVMSupport`,
`LLVMTargetParser`. The full library/include surface grows
incrementally as later patches need it.
* `tests/raiser_empty_module_test.cpp` (gtest) verifies the
scaffolding contract: an empty input produces a well-formed
module with one `AMDGPU_KERNEL` function whose body is `ret void`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cceb980 to
7b8ccf2
Compare
Lands the minimal scaffolding
needed to build a
hotswap-transpilerstatic library:raiser.{hpp,cpp}—raiseToIR()entry point. The implementation creates anllvm::Modulewith a singleret voidAMDGPU kernel function for any input. ELF ingestion, MC disassembly and the per-format raise handlers land in subsequent patches.raise_failure.{hpp,cpp}— structured failure values consumed byRaiseResult::failure.code_object_utils.hpp—KernelMetastruct used in theraiseToIRsignature.CMakeLists.txtlinking onlyLLVMCore,LLVMSupport,LLVMTargetParser. The full library/include surface grows incrementally as later patches need it.tests/raiser_empty_module_test.cpp(gtest) verifies the scaffolding contract: an empty input produces a well-formed module with oneAMDGPU_KERNELfunction whose body isret void.