fix: load Windows backend modules with altered search path by leehack · Pull Request #227 · leehack/llamadart

leehack · 2026-06-19T02:03:11Z

Summary

Temporarily preload the Windows CUDA backend module itself with LoadLibraryExW(..., LOAD_WITH_ALTERED_SEARCH_PATH) before handing the same path to llama.cpp's ggml_backend_load().
Keep bundled CUDA redistributable preloading as a best-effort compatibility path, but no longer rely on it as the primary loader fix.
Add regression coverage for the altered-search-path backend preload, update the dependency-preload test wording, and document the fix in CHANGELOG.md.

Context

Elana reported in leehack/llamadart-native#22 that llamadart-native b9694 bundles cudart64_12.dll, cublas64_12.dll, and cublasLt64_12.dll beside the CUDA backend, but llamadart still fails to load CUDA with an empty reason and enumerates 0 devices. Upstream llama.cpp b9700 CUDA works on the same RTX 5050, so this points at Windows dependency resolution in the package load path rather than the CUDA binary itself.

On Windows, preloading the CUDA redistributables by absolute path is not enough if llama.cpp later calls plain LoadLibraryW for ggml-cuda.dll; Windows can still fail to resolve module-owned transitive imports from the bundle directory. The package-side fix now preloads the backend module with LOAD_WITH_ALTERED_SEARCH_PATH, then releases the temporary preload reference after ggml_backend_load() takes ownership through the llama.cpp registry path.
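The preload-then-hand-off sequence described above can be sketched as follows. This is a minimal illustration in Python with ctypes, not the package's Dart FFI code. The function name `load_backend_with_altered_search_path` and the `load_backend` callback (standing in for the call that reaches `ggml_backend_load()`) are hypothetical. Raising on a failed preload is also an assumption, since the description does not say how that case is handled. Only `LoadLibraryExW`, `FreeLibrary`, and the flag value come from the Win32 API.

```python
import ctypes
import os
import sys

# Documented Win32 flag value for LoadLibraryExW.
LOAD_WITH_ALTERED_SEARCH_PATH = 0x00000008


def load_backend_with_altered_search_path(backend_path, load_backend):
    """Preload backend_path so its imports resolve from its own directory,
    call load_backend(backend_path), then drop the temporary reference.

    load_backend is a hypothetical callback standing in for the code that
    hands the same path to the backend registry.
    """
    # The flag is only defined for absolute paths; a relative path gives
    # undefined behavior per the Win32 documentation.
    if not os.path.isabs(backend_path):
        raise ValueError("backend_path must be absolute")

    if sys.platform != "win32":
        # Nothing to preload on other platforms; just delegate.
        return load_backend(backend_path)

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.LoadLibraryExW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_void_p, ctypes.c_uint32]
    kernel32.LoadLibraryExW.restype = ctypes.c_void_p
    kernel32.FreeLibrary.argtypes = [ctypes.c_void_p]
    kernel32.FreeLibrary.restype = ctypes.c_int

    handle = kernel32.LoadLibraryExW(
        backend_path, None, LOAD_WITH_ALTERED_SEARCH_PATH)
    if not handle:
        # Assumption: surface the Win32 error instead of continuing.
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # The module is already mapped, so a later plain load of the same
        # path only increments its reference count.
        return load_backend(backend_path)
    finally:
        # Release only the temporary preload reference.
        kernel32.FreeLibrary(handle)
```

The `finally` block mirrors the "release the temporary preload reference" step: the module stays loaded as long as the callback took its own reference.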

Existing issue check

Related current report: leehack/llamadart-native#22.
Checked leehack/llamadart issues/PRs for "CUDA DLL Windows", "ggml-cuda", "cudart64 cublas64", "LoadLibraryEx", and "backend dependencies"; no open duplicate package-side PR was found. Older related issues/PRs include #111 ("Blackwell (RTX 50-series, sm_120) unsupported on Windows: Vulkan crash + CUDA runtime too old"), #114 ("Windows build bundles CUDA DLLs despite backends: [vulkan, cpu] configuration"), and #115 ("Filter backend-owned runtime DLLs").

Test Plan

dart format lib/src/backends/llama_cpp/llama_cpp_service.dart test/unit/backends/llama_cpp/llama_cpp_service_test.dart
git diff --check origin/watcher/windows-cuda-dll-preload..HEAD
dart analyze lib/src/backends/llama_cpp/llama_cpp_service.dart test/unit/backends/llama_cpp/llama_cpp_service_test.dart
dart test test/unit/backends/llama_cpp/llama_cpp_service_test.dart - 72 tests passed locally on macOS
Fixed Copilot review feedback by renaming the helper to windowsBackendDependencyPaths and clarifying that it returns absolute paths; resolved the now-outdated thread.
Latest-head CI for e2f68f9c35fc15df25b04e319828e6506ebdd335 passed: Analyze & Lint, companion packages, Native Prompt Reuse Parity, docs build, Linux VM coverage, Chrome web tests, Test Native (macos-latest), Test Native (windows-latest), coverage aggregator, and chat app PR preview.
Real Windows CUDA smoke: Elana retested e2f68f9 on the RTX 5050 with the native bundle directory not on PATH; listGpuDevices(probeBackends: [GpuBackend.cuda]) loaded ggml-cuda.dll from .dart_tool/lib and enumerated the RTX 5050 successfully.

Follow-up

After merge, update/close the linked native issue with the confirmed package-side loader fix and the RTX 5050 no-PATH validation result.

github-actions · 2026-06-19T02:04:39Z

Chat app preview removed for leehack/llamadart-chat-pr-227.

Copilot

Pull request overview

This PR improves Windows CUDA backend loading by preloading bundled CUDA redistributable DLLs (e.g., cudart, cublas, cublasLt) from the resolved native backend bundle directory before attempting to load ggml-cuda.dll. This makes CUDA backend discovery less dependent on PATH/current working directory behavior on Windows and adds a regression test plus a changelog entry.

Changes:

Preload Windows CUDA dependency DLLs (by absolute path) before loading the CUDA backend module, keeping the DynamicLibrary handles alive for the service lifetime.
Add a unit test verifying dependency selection and preload ordering for the Windows CUDA bundle layout.
Document the fix under a new ## Unreleased section in CHANGELOG.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
lib/src/backends/llama_cpp/llama_cpp_service.dart	Adds Windows-only dependency preloading and helper logic to select/sort CUDA DLLs.
test/unit/backends/llama_cpp/llama_cpp_service_test.dart	Adds regression coverage for Windows CUDA dependency selection/order.
CHANGELOG.md	Documents the Windows CUDA preload fix under `Unreleased`.
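The dependency selection and ordering that the review summarizes might look roughly like the sketch below. It is written in Python for illustration and is not the PR's Dart helper. The function name is a snake_case stand-in for `windowsBackendDependencyPaths`, and the prefix list and the order (cudart, then cublasLt, then cublas) are assumptions rather than something the review text confirms.

```python
import os

# Assumed preload order: each DLL is listed after the DLLs it is assumed
# to import. The prefixes match the bundled names cudart64_12.dll,
# cublasLt64_12.dll, and cublas64_12.dll.
_CUDA_DEPENDENCY_PREFIXES = ("cudart64_", "cublaslt64_", "cublas64_")


def windows_backend_dependency_paths(bundle_dir, file_names):
    """Return absolute paths of bundled CUDA redistributables in the
    assumed preload order, ignoring every other file in the bundle."""
    base = os.path.abspath(bundle_dir)
    selected = []
    for prefix in _CUDA_DEPENDENCY_PREFIXES:
        # Sort within a prefix so the result does not depend on the
        # directory listing order.
        for name in sorted(file_names):
            lower = name.lower()
            if lower.startswith(prefix) and lower.endswith(".dll"):
                selected.append(os.path.join(base, name))
    return selected
```

Passing the file names in explicitly, instead of listing the directory inside the function, keeps the selection logic testable on a non-Windows host.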

elana-voss · 2026-06-19T06:37:38Z

On the PR branch (watcher/windows-cuda-dll-preload), clean rebuild, llamadart_native_backends: [cpu, vulkan, cuda], no PATH changes: still 0 devices.

load_backend: failed to load …\.dart_tool\lib\ggml-cuda.dll:

The empty reason is Win32 error 126 (module not found). Testing the bundled ggml-cuda.dll directly:

plain LoadLibrary, no preload → FAIL 126
preload cudart64_12/cublas64_12/cublasLt64_12 by absolute path, then plain LoadLibrary → still FAIL 126
LoadLibraryEx(…, LOAD_WITH_ALTERED_SEARCH_PATH) → OK

So preloading the redistributables doesn't resolve it — the loader still can't find the CUDA module's dependencies in the bundle dir. Adding the bundle directory to the DLL search path does. Confirmed end-to-end: with the bundle dir on PATH, the same listGpuDevices(probeBackends: [GpuBackend.cuda]) returns the GPU:

NVIDIA GeForce RTX 5050 Laptop GPU [CUDA0] discreteGpu 8.0 GiB
devices=1

Suggest loading the backend with LOAD_WITH_ALTERED_SEARCH_PATH (or AddDllDirectory on the bundle dir) rather than preloading by absolute path.
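A three-way check like the one reported above could be reproduced with a small probe along these lines. This is a hedged sketch in Python with ctypes, not the script that was actually run. `probe_load` and `format_result` are hypothetical names, and the probe relies on the documented behavior that `LoadLibraryExW` with no flags acts like plain `LoadLibrary`.

```python
import ctypes
import sys

LOAD_WITH_ALTERED_SEARCH_PATH = 0x00000008  # documented Win32 flag value
ERROR_MOD_NOT_FOUND = 126                   # "module not found"


def format_result(error_code):
    """Render a probe result in the OK / FAIL <code> style used above."""
    return "OK" if error_code == 0 else f"FAIL {error_code}"


def probe_load(dll_path, flags=0, preload=()):
    """Return 0 if dll_path loads, otherwise the Win32 error code.

    Returns None on non-Windows hosts, where the probe does not apply.
    preload is an optional sequence of absolute DLL paths to load first.
    """
    if sys.platform != "win32":
        return None

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.LoadLibraryExW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_void_p, ctypes.c_uint32]
    kernel32.LoadLibraryExW.restype = ctypes.c_void_p
    kernel32.FreeLibrary.argtypes = [ctypes.c_void_p]
    kernel32.FreeLibrary.restype = ctypes.c_int

    handles = []
    try:
        for path in preload:
            # flags=0 behaves like plain LoadLibraryW.
            handle = kernel32.LoadLibraryExW(path, None, 0)
            if handle:
                handles.append(handle)
        target = kernel32.LoadLibraryExW(dll_path, None, flags)
        if not target:
            return ctypes.get_last_error()
        handles.append(target)
        return 0
    finally:
        # Release everything so each probe starts from a clean state.
        for handle in reversed(handles):
            kernel32.FreeLibrary(handle)
```

On Windows, calling `probe_load(path)`, `probe_load(path, preload=deps)`, and `probe_load(path, flags=LOAD_WITH_ALTERED_SEARCH_PATH)` in a fresh process each would correspond to the three rows of the report.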

leehack · 2026-06-19T09:28:17Z

Thanks — this is exactly the missing loader behavior. Preloading cudart/cublas by absolute path is not enough if ggml_backend_load() later calls plain LoadLibraryW for ggml-cuda.dll; Windows can still fail to resolve module-owned transitive imports from the bundle directory.

I updated the PR to temporarily preload the backend module itself with LoadLibraryExW(..., LOAD_WITH_ALTERED_SEARCH_PATH) before handing the same path to llama.cpp's ggml_backend_load(). The explicit CUDA redistributable preload remains only as a best-effort compatibility path.

The latest head e2f68f9 is green in CI, including Test Native (windows-latest), but I still cannot run the real RTX 5050/no-PATH CUDA smoke from this macOS host. Could you retry the same no-PATH CUDA probe on the RTX 5050 setup when you have a chance?

elana-voss · 2026-06-19T11:22:17Z

Confirmed fixed on e2f68f9. Retested the same no-PATH CUDA probe on the RTX 5050 (bundle directory not on PATH):

load_backend: loaded CUDA backend from ...\.dart_tool\lib\ggml-cuda.dll
ggml_cuda_init: found 1 CUDA devices (Total VRAM: 8150 MiB):
  Device 0: NVIDIA GeForce RTX 5050 Laptop GPU, compute capability 12.0, VMM: yes, VRAM: 8150 MiB
  NVIDIA GeForce RTX 5050 Laptop GPU [CUDA0] type=discreteGpu mem=8.0 GiB
CUDA_PROBE_OK devices=1

Before this commit the same probe printed load_backend: failed to load ...ggml-cuda.dll: (empty reason = Win32 error 126) and returned 0 devices. Preloading the backend module itself with LoadLibraryExW(LOAD_WITH_ALTERED_SEARCH_PATH) resolves the module-owned transitive imports from the bundle directory. listGpuDevices(probeBackends: [GpuBackend.cuda]) now enumerates the GPU with no PATH workaround. Good to merge from my side.

Copilot AI review requested due to automatic review settings June 19, 2026 02:03

Copilot started reviewing on behalf of leehack June 19, 2026 02:03

Copilot AI reviewed Jun 19, 2026

Comment thread lib/src/backends/llama_cpp/llama_cpp_service.dart Outdated

fix: preload Windows CUDA backend dependencies

10fe7c8

leehack force-pushed the watcher/windows-cuda-dll-preload branch from 7ba8287 to 10fe7c8 on June 19, 2026 02:10

leehack mentioned this pull request Jun 19, 2026

CUDA backend hard-crashes on RTX 5050 Blackwell during backend load leehack/llamadart-native#22

Closed

fix: load Windows backend modules with altered search path

e2f68f9

leehack changed the title from "fix: preload Windows CUDA backend dependencies" to "fix: load Windows backend modules with altered search path" on Jun 19, 2026

leehack merged commit d824402 into main Jun 19, 2026
11 checks passed

leehack deleted the watcher/windows-cuda-dll-preload branch June 19, 2026 11:56

leehack mentioned this pull request Jun 19, 2026

Prepare 0.8.3 release #228

Merged

7 tasks

leehack merged 2 commits into main from watcher/windows-cuda-dll-preload
