[Chess] Stockfish-style legal move generation: ~11x faster step, fixes silent move truncation #1318
Open
gweber wants to merge 2 commits into
Replace the make-move legality filter (200x _apply_move + _is_checked per position) with precomputed masks, computed once per position:
- checker detection around the king (near pieces + sliders with clear path)
- check-evasion target mask (capture the single checker or block its line; empty on double check)
- absolute pin rays: first own piece along each of the 8 king rays, backed by a matching enemy slider, restricted to moves along that ray (new RAYS/RAY_DIR/IS_DIAG_DIR tables)
- king-danger squares evaluated with the king lifted off the board, so sliders attack through the vacated square

Every pseudo-legal move is now decided by table lookups. En passant (at most two candidates, nasty edge cases) keeps the exact make-move test. Castling and underpromotion logic are unchanged. _is_attacked now takes the board array instead of GameState.

This also fixes a silent correctness bug: the old code compacted candidate moves through jnp.nonzero(..., size=200), silently dropping legal moves in positions with more than 200 pseudo-legal candidates (e.g. multi-queen positions reached in self-play; the known 218-move position returned only 200). There is no cap anymore.

is_terminal now reuses hash_history[0] (written by _update_history) instead of recomputing the Zobrist hash.

Validation (tests/diff_vs_python_chess.py, python-chess as ground truth): 17 tricky FENs (kiwipete, perft 3-6, ep pins, castling through check, double check, 218-move position) with full depth-2 expansion plus 20 random games: zero mismatches.

Benchmark (RTX/GB10, batch 2048, vmap+jit):
  legal_action_mask  7.505 ms -> 0.535 ms (14.0x)
  full step          8.303 ms -> 0.767 ms (10.8x)

(cherry picked from commit 1d0a4a8)
The king-danger pass vmapped _is_attacked over LEGAL_DEST[KING, king_pos], which is padded to 27 (the queen's maximum). A king has at most 8 moves, so 19 lanes were always -1 yet each still ran a full _is_attacked probe. Slice to [:8]. Output identical: perft 20/400/8902; legal-mask matches the prior implementation across 2396 random-playout positions.
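The effect of slicing the padded row can be illustrated with a toy sketch. Everything here is an assumption for illustration: the square indexing (a1 = 0, index = rank * 8 + file), the `king_row` contents, and the `probe` function, which merely stands in for a real `_is_attacked` call. It shows only that the padding lanes contribute nothing, so probing the first 8 lanes yields the same information as probing all 27.

```python
import jax
import jax.numpy as jnp

PAD = 27  # row width of a hypothetical padded destination table
# Destinations of a king on a1 (square 0): b1, a2, b2, then -1 padding.
king_row = jnp.full(PAD, -1, dtype=jnp.int32).at[:3].set(jnp.array([1, 8, 9]))

# Toy attack map standing in for a real attack probe: 1 = attacked square.
attacked = jnp.zeros(64, dtype=jnp.int32).at[jnp.array([8, 9])].set(1)


def probe(sq):
    # Padding lanes (sq < 0) never count as attacked.
    return (sq >= 0) & (attacked[sq] == 1)


danger_full = jax.vmap(probe)(king_row)        # 27 lanes, most of them padding
danger_sliced = jax.vmap(probe)(king_row[:8])  # 8 lanes, same information

print(danger_full.shape, danger_sliced.shape)
print(bool(jnp.array_equal(danger_full[:8], danger_sliced)))
```

Because the batched probe runs once per lane regardless of whether the lane holds a real square, trimming the lane count reduces work without changing the result.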
Summary
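This PR replaces the make-move legality filter in chess `_legal_action_mask` (up to 200x `_apply_move` + `_is_checked` per position) with the classic checkers/pins/king-danger scheme used by conventional engines, computed once per position:
- checker detection around the king (near pieces + sliders with clear path)
- check-evasion target mask (capture the single checker or block its line; empty on double check)
- absolute pin rays: first own piece along each of the 8 king rays, backed by a matching enemy slider, restricted to moves along that ray (new `RAYS`/`RAY_DIR`/`IS_DIAG_DIR` tables)
- king-danger squares evaluated with the king lifted off the board, so sliders attack through the vacated square

After these masks are built, every pseudo-legal move is decided by table lookups, with no per-move make/unmake. En passant (at most two candidates, with its well-known edge cases) keeps the exact make-move test. Castling and underpromotion logic are unchanged.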
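To make the pin-ray and king-lifted ideas concrete, here is a minimal pure-Python sketch. It is not the PR's JAX implementation: the board encoding (a list of 64 characters, uppercase for the side to move, lowercase for the enemy), the square indexing (a1 = 0), and the names `DIRS`, `IS_DIAG`, `scan_king_rays`, and `attacked_by_slider` are all assumptions chosen for readability, and only sliding pieces are handled.

```python
# Squares are 0..63 with index = rank * 8 + file (a1 = 0, h8 = 63).
# Uppercase = side to move, lowercase = enemy, "." = empty.
DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
IS_DIAG = [False, False, False, False, True, True, True, True]


def sq(name):
    """Convert a square name such as 'e2' to its 0..63 index."""
    return (int(name[1]) - 1) * 8 + "abcdefgh".index(name[0])


def ray(start, d):
    """Yield the squares from start (exclusive) to the board edge along d."""
    r, f = divmod(start, 8)
    dr, df = DIRS[d]
    r, f = r + dr, f + df
    while 0 <= r < 8 and 0 <= f < 8:
        yield r * 8 + f
        r, f = r + dr, f + df


def slides_along(piece, d):
    """True if the enemy piece is a slider that moves along direction d."""
    return piece == "q" or piece == ("b" if IS_DIAG[d] else "r")


def scan_king_rays(board, king_sq):
    """Return (pins, checkers); pins maps a pinned own square to its ray index."""
    pins, checkers = {}, []
    for d in range(8):
        blocker = None
        for s in ray(king_sq, d):
            piece = board[s]
            if piece == ".":
                continue
            if piece.isupper():
                if blocker is not None:
                    break        # two own pieces on the ray: no pin
                blocker = s      # first own piece: pin candidate
                continue
            if slides_along(piece, d):
                if blocker is None:
                    checkers.append(s)  # clear path: the slider gives check
                else:
                    pins[blocker] = d   # the own piece is shielding the king
            break                        # any enemy piece ends the scan
    return pins, checkers


def attacked_by_slider(board, target):
    """True if an enemy slider attacks target on the given board."""
    for d in range(8):
        for s in ray(target, d):
            piece = board[s]
            if piece != ".":
                if piece.islower() and slides_along(piece, d):
                    return True
                break
    return False


# Pin: king e2, own bishop e4, enemy rook e8 -> bishop pinned along ray 0.
board = ["."] * 64
board[sq("e2")], board[sq("e4")], board[sq("e8")] = "K", "B", "r"
pins, checkers = scan_king_rays(board, sq("e2"))

# King lifted: without the bishop the rook checks, and e1 is only seen as
# attacked once the king is removed from e2.
board2 = ["."] * 64
board2[sq("e2")], board2[sq("e8")] = "K", "r"
seen_with_king = attacked_by_slider(board2, sq("e1"))
lifted = list(board2)
lifted[sq("e2")] = "."
seen_lifted = attacked_by_slider(lifted, sq("e1"))
print(pins, checkers, seen_with_king, seen_lifted)
```

The second scenario shows why the king has to be lifted: with the king left on e2, the retreat square e1 looks safe because the king itself blocks the rook's line.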
`_is_attacked` now takes the board array instead of `GameState` so it can run on the king-removed board. `is_terminal` now reuses `hash_history[0]` (already written by `_update_history`) instead of recomputing the Zobrist hash.
Fixes a silent correctness bug
The old code compacted candidate moves through `jnp.nonzero(..., size=200)`, silently dropping legal moves in positions with more than 200 pseudo-legal candidates. Example: the known 218-legal-move position returned exactly 200 legal actions (18 dropped, no error). Multi-queen positions of this kind do occur in self-play RL, where the policy then trains on a wrong action mask. With this PR there is no cap anymore.
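The truncation behavior can be reproduced in isolation. The sketch below uses a made-up 4096-slot mask with 218 entries set, not the actual action encoding; it shows only that `jnp.nonzero` with a static `size` returns the first `size` indices and gives no warning when more are present.

```python
import jax.numpy as jnp

# Hypothetical candidate mask with 218 entries set (layout is illustrative).
mask = jnp.zeros(4096, dtype=jnp.bool_).at[:218].set(True)

# A static size smaller than the true count keeps only the first 200 indices.
(idx_small,) = jnp.nonzero(mask, size=200, fill_value=-1)

# A static size larger than the true count pads the tail with fill_value.
(idx_large,) = jnp.nonzero(mask, size=256, fill_value=-1)

kept_small = int((idx_small >= 0).sum())
kept_large = int((idx_large >= 0).sum())
print(int(mask.sum()), kept_small, kept_large)
```

Raising `size` would only move the threshold, which is why removing the compaction step altogether avoids the problem rather than postponing it.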
Validation
`tests/diff_vs_python_chess.py` (included, standalone tool; intentionally not `test_*` so it doesn't extend regular CI): differential test against python-chess as ground truth. Zero mismatches.
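For readers unfamiliar with this kind of check, the following is a rough sketch of how python-chess can serve as a reference. It is not the contents of `tests/diff_vs_python_chess.py`; the helper names `perft` and `mismatch` are hypothetical, and the candidate move list would in practice come from the generator under test.

```python
import chess  # python-chess


def perft(board: chess.Board, depth: int) -> int:
    """Count leaf nodes of the legal-move tree to the given depth."""
    if depth == 0:
        return 1
    total = 0
    for move in list(board.legal_moves):
        board.push(move)
        total += perft(board, depth - 1)
        board.pop()
    return total


def mismatch(board: chess.Board, candidate_uci: set) -> set:
    """Moves present in exactly one of the two generators (UCI strings)."""
    reference = {m.uci() for m in board.legal_moves}
    return reference ^ candidate_uci


counts = [perft(chess.Board(), d) for d in (1, 2, 3)]
print(counts)

# Feeding the reference's own moves back in yields an empty difference.
start = chess.Board()
print(mismatch(start, {m.uci() for m in start.legal_moves}))
```

A non-empty result from `mismatch` pinpoints the exact moves that are missing or spurious in a given position, which is more useful for debugging than a bare node-count disagreement.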
Benchmark
`jax.jit(jax.vmap(...))`, batch 2048, CUDA (NVIDIA GB10):
`_legal_action_mask`: 7.505 ms -> 0.535 ms (14.0x)
`Game.step`: 8.303 ms -> 0.767 ms (10.8x)
The new code only uses gather/scatter/`where`/`argmax` primitives, so it is also friendly to non-CUDA backends (cf. #1317).
Relation to other work
main. `chess/bb`) and [Chess] Use `int8` #1259 (`chess/int8`); this approach keeps the existing `int32` mailbox representation and data layout, so observation/Zobrist/history code is untouched.