[Comgr] Skip embedded libc++ -idirafter when system C++ headers exist; fix VFS path mismatch with clang resource dir#2478
Conversation
…; fix VFS path mismatch with clang resource dir
This commit fixes two related issues with comgr's embedded libc++ headers
support that surface together when compiling HIP with `-nogpuinc` on
real-world Linux hosts.
Issue 1: libstdc++/libc++ header conflict (ROCm-issue-2445)
When `-nogpuinc` is active and embedded libc++ headers are loaded into
the VFS, comgr injected `-idirafter <llvm>/include/c++/v1`. On distros
that ship libstdc++ at clang's default include path (RHEL/manylinux/
Ubuntu), this produces a hybrid include chain. The host libstdc++
`<array>`/`<stdexcept>` pull in a transitive `<stddef.h>` from the
embedded libc++ overlay, whose `#include_next` then has no successor
under `-nogpuinc` and fails.
Example: `#include <array>` → libstdc++ `<array>` → libc++ `stddef.h`
(via VFS) → `#include_next <stddef.h>` → no next file → error.
The embedded set is partial by design (LIBCXX_USER_HEADERS in
cmake/LibcxxHeaders.cmake), so it can only safely substitute when no
host C++ standard library is present.
Fix: detect host libstdc++ / libc++ on disk (honoring `--sysroot=`) and
skip the `-idirafter` injection when found. Also skip when the user
passed `-nostdinc++`, `-nostdinc`, or `-nostdlibinc`. Add
`AMD_COMGR_USE_EMBEDDED_LIBCXX={auto,force,disable}` env override.
Issue 2: VFS embed paths drift from clang's resource-dir path
The clang Driver and the VFS embed code each constructed a clang binary
path from `LLVM_PATH`, but using different string-joining functions.
When `LLVM_PATH` was unset, the two disagreed:
Driver: `(Twine("") + "/bin/clang").str()` → "/bin/clang"
Embed: `SmallString<256>(""); append("bin","clang")` → "bin/clang"
Both then fed `clang::GetResourcesPath()`, which strips two parents and
appends `lib/clang/<ver>`:
"/bin/clang" → "/lib/clang/23" (absolute)
"bin/clang" → "lib/clang/23" (relative)
So clang searched `/lib/clang/23/include/stddef.h` while comgr planted
the file at `lib/clang/23/include/stddef.h`. Mismatch → header not
found.
Fix: extract a single `env::getClangBinaryPath()` helper used by both
the Driver and the VFS embed code, so the two paths cannot drift again.
When `LLVM_PATH` is unset, locate the real clang via `dladdr` on the
loaded libamd_comgr.so and probe sibling layouts (`<prefix>/llvm/bin/
clang` for ROCm, `<prefix>/bin/clang` for standard installs); fall back
to `/bin/clang` only if neither is found, which keeps Driver and VFS
consistent even on unusual layouts.
Tests: added compile_hip_stdlibcxx_conflict_test (system-libstdc++
chain), compile_hip_with_system_libcxx_test (asserts skip-detection
fires when system libc++ present), compile_hip_distroless_test
(forces embedded path via env override). All three skip cleanly on
Windows; the conflict test also skips when no clang resource dir is
reachable on disk (no false failure when prerequisites aren't present).
|
Can this be split into two separate PRs for the two issues? Also comments are a bit verbose (ok if needed for the complexity of the code, but otherwise shorter would be nice) |
|
Let me revive #1536, I think it gets at your second issue |
|
Thanks Jacob, makes sense to split. Plan: PR A will be a small follow-up to #1536 that updates the VFS embed site in |
This commit fixes two related issues with comgr's embedded libc++ headers
support that surface together when compiling HIP with
-nogpuinconreal-world Linux hosts.
Issue 1: libstdc++/libc++ header conflict (ROCm-issue-2445)
When
-nogpuincis active and embedded libc++ headers are loaded intothe VFS, comgr injected
-idirafter <llvm>/include/c++/v1. On distrosthat ship libstdc++ at clang's default include path (RHEL/manylinux/
Ubuntu), this produces a hybrid include chain. The host libstdc++
<array>/<stdexcept>pull in a transitive<stddef.h>from theembedded libc++ overlay, whose
#include_nextthen has no successorunder
-nogpuincand fails.Example:
#include <array>→ libstdc++<array>→ libc++stddef.h(via VFS) →
#include_next <stddef.h>→ no next file → error.The embedded set is partial by design (LIBCXX_USER_HEADERS in
cmake/LibcxxHeaders.cmake), so it can only safely substitute when no
host C++ standard library is present.
Fix: detect host libstdc++ / libc++ on disk (honoring
--sysroot=) andskip the
-idirafterinjection when found. Also skip when the userpassed
-nostdinc++,-nostdinc, or-nostdlibinc. AddAMD_COMGR_USE_EMBEDDED_LIBCXX={auto,force,disable}env override.Issue 2: VFS embed paths drift from clang's resource-dir path
The clang Driver and the VFS embed code each constructed a clang binary
path from
LLVM_PATH, but using different string-joining functions.When
LLVM_PATHwas unset, the two disagreed:Driver:
(Twine("") + "/bin/clang").str()→ "/bin/clang"Embed:
SmallString<256>(""); append("bin","clang")→ "bin/clang"Both then fed
clang::GetResourcesPath(), which strips two parents andappends
lib/clang/<ver>:"/bin/clang" → "/lib/clang/23" (absolute)
"bin/clang" → "lib/clang/23" (relative)
So clang searched
/lib/clang/23/include/stddef.hwhile comgr plantedthe file at
lib/clang/23/include/stddef.h. Mismatch → header notfound.
Fix: extract a single
env::getClangBinaryPath()helper used by boththe Driver and the VFS embed code, so the two paths cannot drift again.
When
LLVM_PATHis unset, locate the real clang viadladdron theloaded libamd_comgr.so and probe sibling layouts (
<prefix>/llvm/bin/ clangfor ROCm,<prefix>/bin/clangfor standard installs); fall backto
/bin/clangonly if neither is found, which keeps Driver and VFSconsistent even on unusual layouts.
Tests: added compile_hip_stdlibcxx_conflict_test (system-libstdc++
chain), compile_hip_with_system_libcxx_test (asserts skip-detection
fires when system libc++ present), compile_hip_distroless_test
(forces embedded path via env override). All three skip cleanly on
Windows; the conflict test also skips when no clang resource dir is
reachable on disk (no false failure when prerequisites aren't present).