[Chess] Stockfish-style legal move generation: ~11x faster step, fixes silent move truncation #1318

Open
gweber wants to merge 2 commits into sotetsuk:main from gweber:chess-fast-movegen

gweber commented Jun 10, 2026

Summary

This PR replaces the make-move legality filter in the chess _legal_action_mask (up to 200 _apply_move + _is_checked calls per position) with the classic checkers/pins/king-danger scheme used by conventional engines, computed once per position:

  • Checker detection around the king (near pieces + sliders with a clear path)
  • Check-evasion target mask: non-king moves must capture the single checker or block its line; empty on double check
  • Absolute pin rays: the first own piece along each of the 8 king rays, backed by a matching enemy slider, is restricted to moves along that ray (new RAYS / RAY_DIR / IS_DIAG_DIR tables)
  • King-danger squares evaluated with the king lifted off the board, so sliders keep attacking through the vacated square
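As an illustration of the pin-ray bullet above, here is a minimal sketch of detecting one absolute pin along a single king ray. It is not the PR's code: the piece codes, the flat 64-square layout, the sign convention (enemy pieces negative), and the function name `pinned_square` are all assumptions made for the example.

```python
import jax.numpy as jnp

# Hypothetical piece codes; enemy pieces are stored as negative values.
EMPTY, PAWN, KNIGHT, BISHOP, ROOK, QUEEN = 0, 1, 2, 3, 4, 5

def pinned_square(board, ray, is_diag):
    """Return the square of the own piece pinned along `ray`, or -1.

    `ray` lists the squares from the king outwards, padded with -1.
    `is_diag` is a static Python bool selecting bishop vs. rook as the slider.
    """
    valid = ray >= 0
    pieces = jnp.where(valid, board[ray], EMPTY)   # padding reads are masked out
    occupied = pieces != EMPTY
    first = jnp.argmax(occupied)                   # first piece seen from the king
    rest = occupied & (jnp.arange(ray.shape[0]) > first)
    second = jnp.argmax(rest)                      # next piece behind it
    slider = BISHOP if is_diag else ROOK
    backed = rest.any() & ((pieces[second] == -slider) | (pieces[second] == -QUEEN))
    own_first = occupied.any() & (pieces[first] > 0)
    return jnp.where(own_first & backed, ray[first], -1)

# King on e1 (square 4), own knight on e2 (12), enemy rook on e8 (60).
north = jnp.array([12, 20, 28, 36, 44, 52, 60])
board = jnp.zeros(64, dtype=jnp.int32).at[12].set(KNIGHT).at[60].set(-ROOK)

pinned = int(pinned_square(board, north, is_diag=False))          # knight is pinned
not_pinned = int(pinned_square(board.at[60].set(-BISHOP), north, is_diag=False))
blocked = int(pinned_square(board.at[28].set(PAWN), north, is_diag=False))
```

In this sketch the knight on e2 counts as pinned only when the next piece behind it on the file is an enemy rook or queen; a bishop on the file, or a second own piece in between, leaves it free to move.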

After these masks are built, every pseudo-legal move is decided by table lookups, with no per-move make/unmake. En passant (at most two candidates, with its well-known edge cases) keeps the exact make-move test. Castling and underpromotion logic are unchanged. _is_attacked now takes the board array instead of GameState so it can run on the king-removed board.
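To see why the king-danger squares are evaluated on a king-removed board, consider the toy sketch below. It is an assumption-laden illustration, not the PR's `_is_attacked`: the 8x8 layout, the piece codes, and the helper `rook_attacks` are hypothetical.

```python
import jax.numpy as jnp

EMPTY, ROOK, KING = 0, 4, 6  # hypothetical piece codes; enemy pieces are negative

def rook_attacks(board, src):
    """Boolean (8, 8) mask of squares attacked by a rook on `src` (static ints)."""
    r0, c0 = src
    attacked = jnp.zeros((8, 8), dtype=bool)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        open_ray = jnp.array(True)           # True until the ray has hit a piece
        r, c = r0 + dr, c0 + dc
        while 0 <= r < 8 and 0 <= c < 8:
            attacked = attacked.at[r, c].set(open_ray)
            open_ray = open_ray & (board[r, c] == EMPTY)
            r, c = r + dr, c + dc
    return attacked

board = jnp.zeros((8, 8), dtype=jnp.int32)
board = board.at[0, 0].set(-ROOK)   # enemy rook on a1
board = board.at[0, 3].set(KING)    # own king on d1

naive = rook_attacks(board, (0, 0))                        # king blocks the ray
lifted = rook_attacks(board.at[0, 3].set(EMPTY), (0, 0))   # king lifted off
print(bool(naive[0, 4]), bool(lifted[0, 4]))  # False True
```

With the king left on d1, e1 looks safe because the king itself blocks the rook's ray. Lifting the king shows that stepping to e1 would still leave it in check.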

is_terminal now reuses hash_history[0] (already written by _update_history) instead of recomputing the Zobrist hash.

Fixes a silent correctness bug

The old code compacted candidate moves through jnp.nonzero(..., size=200), silently dropping legal moves in positions with more than 200 pseudo-legal candidates. Example: the known 218-legal-move position

R6R/3Q4/1Q4Q1/4Q3/2Q4Q/Q4Q2/pp1Q4/kBNN1KB1 w - - 0 1

returned exactly 200 legal actions (18 dropped, no error). Multi-queen positions of this kind do occur in self-play RL, where the policy then trains on a wrong action mask. This PR removes the cap.
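The truncation is easy to reproduce in isolation. The sketch below uses an arbitrary 512-entry candidate mask (a stand-in, not the real action layout) to show that `jnp.nonzero` with a static `size` keeps only the first `size` hits and raises no error.

```python
import jax.numpy as jnp

# Stand-in candidate mask with 218 set entries (length 512 is arbitrary).
mask = jnp.zeros(512, dtype=bool).at[:218].set(True)

# A static `size` makes the output shape fixed for jit, but extra hits are dropped.
idx = jnp.nonzero(mask, size=200, fill_value=-1)[0]

kept = int((idx >= 0).sum())
dropped = int(mask.sum()) - kept
print(kept, dropped)  # 200 18
```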

Validation

tests/diff_vs_python_chess.py (included as a standalone tool; intentionally not named test_* so it doesn't extend regular CI) is a differential test against python-chess as ground truth.

  • 17 adversarial FENs (kiwipete, perft positions 3–6, en-passant pins, ep capture of a checking pawn, castling through attacked squares, double check, pinned-knight zugzwang, the 218-move position) with full depth-2 expansion (every child of every legal move compared again)
  • 20 random games, every position compared move-by-move

Zero mismatches.

Benchmark

jax.jit(jax.vmap(...)), batch 2048, CUDA (NVIDIA GB10):

| | before | after | speedup |
|---|---|---|---|
| _legal_action_mask | 7.505 ms | 0.535 ms | 14.0x |
| full Game.step | 8.303 ms | 0.767 ms | 10.8x |

The new code only uses gather/scatter/where/argmax primitives, so it is also friendly to non-CUDA backends (cf. #1317).
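For readers who want to reproduce this kind of measurement, a minimal timing harness might look like the following. The function `toy_mask` is a hypothetical stand-in for the real per-position function, and the repeat count is arbitrary; only the `jax.jit(jax.vmap(...))` shape and the batch size of 2048 come from the setup described above.

```python
import time
import jax
import jax.numpy as jnp

def toy_mask(board):
    # Hypothetical stand-in for the per-position function being benchmarked.
    return board > 0

batched = jax.jit(jax.vmap(toy_mask))
boards = jnp.zeros((2048, 64), dtype=jnp.int32)   # batch of 2048 positions

batched(boards).block_until_ready()   # warm-up call, keeps compile time out of the timing

n = 20
t0 = time.perf_counter()
for _ in range(n):
    out = batched(boards)
    out.block_until_ready()           # wait for the device before the next iteration
ms = (time.perf_counter() - t0) / n * 1e3
print(f"{ms:.3f} ms per batched call")
```

Blocking on the result matters because JAX dispatches work asynchronously; without it the loop would mostly measure dispatch overhead.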

Relation to other work

gweber added 2 commits June 10, 2026 14:49
Replace the make-move legality filter (200x _apply_move + _is_checked per
position) with precomputed masks, computed once per position:

- checker detection around the king (near pieces + sliders with clear path)
- check-evasion target mask (capture the single checker or block its line;
  empty on double check)
- absolute pin rays: first own piece along each of the 8 king rays, backed
  by a matching enemy slider, restricted to moves along that ray (new
  RAYS/RAY_DIR/IS_DIAG_DIR tables)
- king-danger squares evaluated with the king lifted off the board, so
  sliders attack through the vacated square

Every pseudo-legal move is now decided by table lookups. En passant (at most
two candidates, nasty edge cases) keeps the exact make-move test. Castling
and underpromotion logic are unchanged. _is_attacked now takes the board
array instead of GameState.

This also fixes a silent correctness bug: the old code compacted candidate
moves through jnp.nonzero(..., size=200), silently dropping legal moves in
positions with more than 200 pseudo-legal candidates (e.g. multi-queen
positions reached in self-play; the known 218-move position returned only
200). There is no cap anymore.

is_terminal now reuses hash_history[0] (written by _update_history) instead
of recomputing the Zobrist hash.

Validation (tests/diff_vs_python_chess.py, python-chess as ground truth):
17 tricky FENs (kiwipete, perft 3-6, ep pins, castling through check,
double check, 218-move position) with full depth-2 expansion plus 20 random
games - zero mismatches.

Benchmark (NVIDIA GB10, batch 2048, vmap+jit):
  legal_action_mask  7.505 ms -> 0.535 ms  (14.0x)
  full step          8.303 ms -> 0.767 ms  (10.8x)

(cherry picked from commit 1d0a4a8)

The king-danger pass vmapped _is_attacked over LEGAL_DEST[KING, king_pos],
which is padded to 27 (the queen's maximum). A king has at most 8 moves, so
19 lanes were always -1 yet each still ran a full _is_attacked probe. Slice
to [:8].

Output identical: perft 20/400/8902; legal-mask matches the prior
implementation across 2396 random-playout positions.
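The effect of the slice can be sketched with a toy destination table. Everything here is an assumption for illustration: `KING_DEST` is a hypothetical table built in the same padded style (27 columns, -1 for unused lanes), and `probe` is a cheap stand-in for _is_attacked.

```python
import jax
import jax.numpy as jnp

PAD = 27  # padded width, matching the queen's maximum number of destinations

def king_dests(sq):
    """King destinations from `sq`, padded with -1 up to PAD entries."""
    r, c = divmod(sq, 8)
    dests = [(r + dr) * 8 + (c + dc)
             for dr in (-1, 0, 1) for dc in (-1, 0, 1)
             if (dr or dc) and 0 <= r + dr < 8 and 0 <= c + dc < 8]
    return dests + [-1] * (PAD - len(dests))

KING_DEST = jnp.array([king_dests(s) for s in range(64)])   # shape (64, 27)

def probe(danger, dest):
    # Stand-in for an attack probe: padded lanes (-1) always report False.
    return (dest >= 0) & danger[dest]

danger = jnp.zeros(64, dtype=bool).at[jnp.array([19, 36])].set(True)
sq = 27  # d4

full = jax.vmap(probe, in_axes=(None, 0))(danger, KING_DEST[sq])       # 27 lanes
short = jax.vmap(probe, in_axes=(None, 0))(danger, KING_DEST[sq, :8])  # 8 lanes
```

Because all valid king destinations sit in the first 8 columns, the 8-lane result carries the same information as the 27-lane one while running 19 fewer probes per position.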