chore(dflash): enforce sm_89 user override and keep BSA enabled#137
chore(dflash): enforce sm_89 user override and keep BSA enabled#137javierpazo wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
1 issue found across 1 file
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="dflash/CMakeLists.txt">
<violation number="1" location="dflash/CMakeLists.txt:193">
P2: Removing the SM80 guard can make `test_flashprefill_kernels` link fail when FlashPrefill sources are not compiled into `dflash27b`.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Two CMake-side rough edges that bit me on Windows MSVC + CUDA 12.x
on RTX 6000 Ada (sm_89, Ada-only):
1. CUDA architectures: when no explicit override is provided,
the previous CMakeLists could fall back to `75;86`, which
caused silent build issues on Ada-only setups. This change
respects DFLASH27B_USER_CUDA_ARCHITECTURES (e.g. `89`) and
uses it consistently across the dflash and submodule
ggml/llama.cpp consumers.
2. BSA was sometimes silently disabled depending on detection
order. DFLASH27B_ENABLE_BSA is now respected as an explicit
opt-in/opt-out and a clear status line is printed at
configure time.
Net effect: a single-arch Ada-only build with BSA enabled is
reproducible from a clean checkout. Default behaviour (no
DFLASH27B_USER_CUDA_ARCHITECTURES set, BSA on) is preserved for
existing users.
Validation:
cmake -S dflash -B dflash/build/Release \
-DCMAKE_BUILD_TYPE=Release \
-DDFLASH27B_USER_CUDA_ARCHITECTURES=89 \
-DDFLASH27B_ENABLE_BSA=ON
cmake --build dflash/build/Release --target test_dflash --parallel 8
-> BUILD_EXIT_CODE=0, sm_89 single-arch confirmed.
Verification vs existing community PRs:
COMP-COMPL with Luce-Org#48 ("auto-detect GPU arch to prevent sm_120a on
consumer Blackwell", open) and Luce-Org#91 ("expose BSA config as CLI
flags with safety warnings", merged 2026-05-04). Luce-Org#48 covers
auto-detect; Luce-Org#91 covers runtime CLI. This PR covers the
build-time CMake side: respect the user's explicit
DFLASH27B_USER_CUDA_ARCHITECTURES override and keep
DFLASH27B_ENABLE_BSA honest. The three PRs together give
sensible defaults per hardware tier.
Author: Javier Pazo <xabicasa@gmail.com>
d423264 to
297fc74
Compare
|
@cubic-dev-ai good catch — restored the SM80 guard around \ est_flashprefill_kernels. The kernel sources are gated on _dflash27b_min_sm GREATER_EQUAL 80\ further up (\DFLASH27B_HAVE_FLASHPREFILL), so the test would have failed to link on legacy arches. Force-pushed; only the test target is now conditional, the rest of the CMake cleanup is unchanged. |
@javierpazo Makes sense — the The force-push looks good. |
Record the 2026-05-27 18:17 scheduled run, including fresh PR classification and targeted worktree/Codex probes for stale PRs Luce-Org#137 and Luce-Org#135.
Revalidate PRs Luce-Org#137 and Luce-Org#135 in isolated worktrees, record conflict shape and current-layout recommendations. Upstream main remains unchanged at 4f4d82e.
Keep the current server/CMakeLists.txt CUDA architecture and BSA handling; the old dflash/CMakeLists.txt file modified by PR Luce-Org#137 has been removed from the current tree, so resolving the modify/delete conflict by preserving the deletion records the PR head as represented without reintroducing the retired layout.
chore(dflash): enforce sm_89 user override and keep BSA enabled
Two CMake-side rough edges that bit me on Windows MSVC + CUDA 12.x
on RTX 6000 Ada (sm_89, Ada-only):
CUDA architectures: when no explicit override is provided,
the previous CMakeLists could fall back to
75;86, whichcaused silent build issues on Ada-only setups. This change
respects DFLASH27B_USER_CUDA_ARCHITECTURES (e.g.
89) anduses it consistently across the dflash and submodule
ggml/llama.cpp consumers.
BSA was sometimes silently disabled depending on detection
order. DFLASH27B_ENABLE_BSA is now respected as an explicit
opt-in/opt-out and a clear status line is printed at
configure time.
Net effect: a single-arch Ada-only build with BSA enabled is
reproducible from a clean checkout. Default behaviour (no
DFLASH27B_USER_CUDA_ARCHITECTURES set, BSA on) is preserved for
existing users.
Validation:
cmake -S dflash -B dflash/build/Release
-DCMAKE_BUILD_TYPE=Release
-DDFLASH27B_USER_CUDA_ARCHITECTURES=89
-DDFLASH27B_ENABLE_BSA=ON
cmake --build dflash/build/Release --target test_dflash --parallel 8
-> BUILD_EXIT_CODE=0, sm_89 single-arch confirmed.
Verification vs existing community PRs:
COMP-COMPL with #48 ("auto-detect GPU arch to prevent sm_120a on
consumer Blackwell", open) and #91 ("expose BSA config as CLI
flags with safety warnings", merged 2026-05-04). #48 covers
auto-detect; #91 covers runtime CLI. This PR covers the
build-time CMake side: respect the user's explicit
DFLASH27B_USER_CUDA_ARCHITECTURES override and keep
DFLASH27B_ENABLE_BSA honest. The three PRs together give
sensible defaults per hardware tier.
Author: Javier Pazo xabicasa@gmail.com