fix: load Windows backend modules with altered search path #227

Merged
leehack merged 2 commits into main from watcher/windows-cuda-dll-preload, Jun 19, 2026

leehack (Owner) commented Jun 19, 2026

Summary

  • Temporarily preload the Windows CUDA backend module itself with LoadLibraryExW(..., LOAD_WITH_ALTERED_SEARCH_PATH) before handing the same path to llama.cpp's ggml_backend_load().
  • Keep bundled CUDA redistributable preloading as a best-effort compatibility path, but no longer rely on it as the primary loader fix.
  • Add regression coverage for the altered-search-path backend preload, update the dependency-preload test wording, and document the fix in CHANGELOG.md.

Context

Elana reported in leehack/llamadart-native#22 that llamadart-native b9694 bundles cudart64_12.dll, cublas64_12.dll, and cublasLt64_12.dll beside the CUDA backend, but llamadart still fails to load CUDA with an empty reason and enumerates 0 devices. Upstream llama.cpp b9700 CUDA works on the same RTX 5050, so this points at Windows dependency resolution in the package load path rather than the CUDA binary itself.

On Windows, preloading the CUDA redistributables by absolute path is not enough if llama.cpp later calls plain LoadLibraryW for ggml-cuda.dll; Windows can still fail to resolve module-owned transitive imports from the bundle directory. The package-side fix now preloads the backend module with LOAD_WITH_ALTERED_SEARCH_PATH, then releases the temporary preload reference after ggml_backend_load() takes ownership through the llama.cpp registry path.
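
A minimal sketch of the preload-then-release sequence described above, written in Python with ctypes rather than the package's Dart FFI code. The `load_backend` callback is a hypothetical stand-in for the call into llama.cpp's ggml_backend_load(); only LoadLibraryExW, FreeLibrary, and the LOAD_WITH_ALTERED_SEARCH_PATH value come from the Win32 API. The module path is assumed to be absolute, because the flag's behavior is undefined for relative paths. On non-Windows hosts the preload is a no-op.

```python
import ctypes
import sys
from contextlib import contextmanager

# Win32 flag value for LoadLibraryExW.
LOAD_WITH_ALTERED_SEARCH_PATH = 0x00000008


@contextmanager
def altered_search_path_preload(module_path):
    """Hold a temporary reference to module_path while the caller loads it again.

    With LOAD_WITH_ALTERED_SEARCH_PATH, Windows resolves the module's imports
    starting from the module's own directory instead of the application's.
    """
    if sys.platform != "win32":
        # Nothing to preload on other platforms.
        yield None
        return

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.LoadLibraryExW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_void_p, ctypes.c_uint32,
    ]
    kernel32.LoadLibraryExW.restype = ctypes.c_void_p
    kernel32.FreeLibrary.argtypes = [ctypes.c_void_p]
    kernel32.FreeLibrary.restype = ctypes.c_int

    handle = kernel32.LoadLibraryExW(
        module_path, None, LOAD_WITH_ALTERED_SEARCH_PATH
    )
    if not handle:
        # Assumption of this sketch: a failed preload is treated as fatal.
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        yield handle
    finally:
        # Drop only the temporary reference; a second load of the same path
        # inside the `with` block keeps the module mapped.
        kernel32.FreeLibrary(handle)


def load_backend_with_preload(module_path, load_backend):
    """Run the hypothetical load_backend callback while the preload is held."""
    with altered_search_path_preload(module_path):
        return load_backend(module_path)
```

Because Windows reference-counts loaded modules, the later plain load of the same path finds the module already mapped with its imports resolved, and releasing the preload handle afterwards does not unload it.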

Existing issue check

Test Plan

  • dart format lib/src/backends/llama_cpp/llama_cpp_service.dart test/unit/backends/llama_cpp/llama_cpp_service_test.dart
  • git diff --check origin/watcher/windows-cuda-dll-preload..HEAD
  • dart analyze lib/src/backends/llama_cpp/llama_cpp_service.dart test/unit/backends/llama_cpp/llama_cpp_service_test.dart
  • dart test test/unit/backends/llama_cpp/llama_cpp_service_test.dart - 72 tests passed locally on macOS
  • Fixed Copilot review feedback by renaming the helper to windowsBackendDependencyPaths and clarifying that it returns absolute paths; resolved the now-outdated thread.
  • Latest-head CI for e2f68f9c35fc15df25b04e319828e6506ebdd335 passed: Analyze & Lint, companion packages, Native Prompt Reuse Parity, docs build, Linux VM coverage, Chrome web tests, Test Native (macos-latest), Test Native (windows-latest), coverage aggregator, and chat app PR preview.
  • Real Windows CUDA smoke: Elana retested e2f68f9 on the RTX 5050 with the native bundle directory not on PATH; listGpuDevices(probeBackends: [GpuBackend.cuda]) loaded ggml-cuda.dll from .dart_tool/lib and enumerated the RTX 5050 successfully.

Follow-up

  • After merge, update/close the linked native issue with the confirmed package-side loader fix and the RTX 5050 no-PATH validation result.

Copilot AI review requested due to automatic review settings June 19, 2026 02:03
github-actions (Bot, Contributor) commented Jun 19, 2026

Chat app preview removed for leehack/llamadart-chat-pr-227.

Copilot AI left a comment

Pull request overview

This PR improves Windows CUDA backend loading by preloading bundled CUDA redistributable DLLs (e.g., cudart, cublas, cublasLt) from the resolved native backend bundle directory before attempting to load ggml-cuda.dll. This makes CUDA backend discovery less dependent on PATH/current working directory behavior on Windows and adds a regression test plus a changelog entry.

Changes:

  • Preload Windows CUDA dependency DLLs (by absolute path) before loading the CUDA backend module, keeping the DynamicLibrary handles alive for the service lifetime.
  • Add a unit test verifying dependency selection and preload ordering for the Windows CUDA bundle layout.
  • Document the fix under a new ## Unreleased section in CHANGELOG.md.
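
The dependency selection and ordering step can be illustrated with a small pure function. This is a sketch in Python, not the package's Dart helper: the function name mirrors windowsBackendDependencyPaths from the test plan, but the file-name pattern and the preload order (cudart, then cublasLt, then cublas) are assumptions made for illustration and are not taken from the PR.

```python
import ntpath
import re

# Assumed preload order: the runtime first, then cublasLt, then cublas,
# which depends on cublasLt.
_PRELOAD_ORDER = ("cudart", "cublaslt", "cublas")

# Assumed naming scheme, matching e.g. cudart64_12.dll or cublasLt64_12.dll.
_DEPENDENCY_PATTERN = re.compile(
    r"^(cudart|cublaslt|cublas)64_\d+\.dll$", re.IGNORECASE
)


def windows_backend_dependency_paths(bundle_dir, file_names):
    """Return absolute paths of bundled CUDA redistributables in preload order.

    bundle_dir must be an absolute Windows path; file_names is the directory
    listing (names only). Unrelated files such as ggml-cuda.dll are skipped.
    """
    if not ntpath.isabs(bundle_dir):
        raise ValueError("bundle_dir must be an absolute Windows path")

    ranked = []
    for name in file_names:
        match = _DEPENDENCY_PATTERN.match(name)
        if match is None:
            continue
        rank = _PRELOAD_ORDER.index(match.group(1).lower())
        # Sort by rank first, then by lower-cased name for a stable order.
        ranked.append((rank, name.lower(), ntpath.join(bundle_dir, name)))

    return [path for _, _, path in sorted(ranked)]
```

Using ntpath rather than os.path keeps the path handling Windows-style even when the function is exercised on macOS or Linux, which is how the unit tests in this PR were run locally.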

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| lib/src/backends/llama_cpp/llama_cpp_service.dart | Adds Windows-only dependency preloading and helper logic to select/sort CUDA DLLs. |
| test/unit/backends/llama_cpp/llama_cpp_service_test.dart | Adds regression coverage for Windows CUDA dependency selection/order. |
| CHANGELOG.md | Documents the Windows CUDA preload fix under Unreleased. |

Comment thread lib/src/backends/llama_cpp/llama_cpp_service.dart Outdated
elana-voss (Contributor) commented

On the PR branch (watcher/windows-cuda-dll-preload), clean rebuild, llamadart_native_backends: [cpu, vulkan, cuda], no PATH changes: still 0 devices.

load_backend: failed to load …\.dart_tool\lib\ggml-cuda.dll:

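The empty reason is Win32 error 126 (ERROR_MOD_NOT_FOUND, module not found), which Windows also reports when one of the module's own dependencies cannot be resolved. Testing the bundled ggml-cuda.dll directly: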

  • plain LoadLibrary, no preload → FAIL 126
  • preload cudart64_12/cublas64_12/cublasLt64_12 by absolute path, then plain LoadLibrary → still FAIL 126
  • LoadLibraryEx(…, LOAD_WITH_ALTERED_SEARCH_PATH) → OK

So preloading the redistributables doesn't resolve it — the loader still can't find the CUDA module's dependencies in the bundle dir. Adding the bundle directory to the DLL search path does. Confirmed end-to-end: with the bundle dir on PATH, the same listGpuDevices(probeBackends: [GpuBackend.cuda]) returns the GPU:

NVIDIA GeForce RTX 5050 Laptop GPU [CUDA0] discreteGpu 8.0 GiB
devices=1

Suggest loading the backend with LOAD_WITH_ALTERED_SEARCH_PATH (or AddDllDirectory on the bundle dir) rather than preloading by absolute path.
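
For comparison, the AddDllDirectory alternative could look roughly like the sketch below, again in Python with ctypes and with a hypothetical function name. One caveat from the Win32 documentation: directories registered with AddDllDirectory are only consulted by LoadLibraryEx calls that pass LOAD_LIBRARY_SEARCH_USER_DIRS (which LOAD_LIBRARY_SEARCH_DEFAULT_DIRS includes), or after the process has called SetDefaultDllDirectories. Whether this route would help a plain LoadLibraryW call inside llama.cpp is therefore an assumption that would need testing on the affected machine.

```python
import ctypes
import sys

# Win32 flag value for LoadLibraryExW; it includes the directories that were
# registered through AddDllDirectory.
LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000


def load_with_added_dll_directory(bundle_dir, module_path):
    """Load module_path with bundle_dir registered as an extra DLL directory.

    Returns the module handle on Windows (the caller owns it and should pass
    it to FreeLibrary when done), or None on other platforms.
    """
    if sys.platform != "win32":
        return None

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.AddDllDirectory.argtypes = [ctypes.c_wchar_p]
    kernel32.AddDllDirectory.restype = ctypes.c_void_p
    kernel32.RemoveDllDirectory.argtypes = [ctypes.c_void_p]
    kernel32.RemoveDllDirectory.restype = ctypes.c_int
    kernel32.LoadLibraryExW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_void_p, ctypes.c_uint32,
    ]
    kernel32.LoadLibraryExW.restype = ctypes.c_void_p

    # bundle_dir must be an absolute path.
    cookie = kernel32.AddDllDirectory(bundle_dir)
    if not cookie:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        handle = kernel32.LoadLibraryExW(
            module_path, None, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS
        )
        if not handle:
            raise ctypes.WinError(ctypes.get_last_error())
        return handle
    finally:
        # Unregister the directory once the module and its imports are mapped.
        kernel32.RemoveDllDirectory(cookie)
```

Compared with LOAD_WITH_ALTERED_SEARCH_PATH, this variant changes process-wide search state for the duration of the call, which is one reason a per-load flag may be the less intrusive choice.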

leehack (Owner, Author) commented Jun 19, 2026

Thanks — this is exactly the missing loader behavior. Preloading cudart/cublas by absolute path is not enough if ggml_backend_load() later calls plain LoadLibraryW for ggml-cuda.dll; Windows can still fail to resolve module-owned transitive imports from the bundle directory.

I updated the PR to temporarily preload the backend module itself with LoadLibraryExW(..., LOAD_WITH_ALTERED_SEARCH_PATH) before handing the same path to llama.cpp's ggml_backend_load(). The explicit CUDA redistributable preload remains only as a best-effort compatibility path.

The latest head e2f68f9 is green in CI, including Test Native (windows-latest), but I still cannot run the real RTX 5050/no-PATH CUDA smoke from this macOS host. Could you retry the same no-PATH CUDA probe on the RTX 5050 setup when you have a chance?

leehack changed the title from "fix: preload Windows CUDA backend dependencies" to "fix: load Windows backend modules with altered search path" on Jun 19, 2026
elana-voss (Contributor) commented

Confirmed fixed on e2f68f9. Retested the same no-PATH CUDA probe on the RTX 5050 (bundle directory not on PATH):

load_backend: loaded CUDA backend from ...\.dart_tool\lib\ggml-cuda.dll
ggml_cuda_init: found 1 CUDA devices (Total VRAM: 8150 MiB):
  Device 0: NVIDIA GeForce RTX 5050 Laptop GPU, compute capability 12.0, VMM: yes, VRAM: 8150 MiB
  NVIDIA GeForce RTX 5050 Laptop GPU [CUDA0] type=discreteGpu mem=8.0 GiB
CUDA_PROBE_OK devices=1

Before this commit the same probe printed load_backend: failed to load ...ggml-cuda.dll: (empty reason = Win32 error 126) and returned 0 devices. Preloading the backend module itself with LoadLibraryExW(LOAD_WITH_ALTERED_SEARCH_PATH) resolves the module-owned transitive imports from the bundle directory. listGpuDevices(probeBackends: [GpuBackend.cuda]) now enumerates the GPU with no PATH workaround. Good to merge from my side.

leehack merged commit d824402 into main on Jun 19, 2026 (11 checks passed)
leehack deleted the watcher/windows-cuda-dll-preload branch on June 19, 2026 at 11:56
leehack mentioned this pull request on Jun 19, 2026 (7 tasks)