Commit 3ee28c7
Refactor: Qwen3 decode with 3-scope architecture and TILELET rename
- qwen3_32b_decode.py: Refactored into 3 scopes for better incore
  * Scope 1: Input RMSNorm + Q/K/V projection
  * Scope 2: Attention (K RoPE + cache, QK matmul, softmax, SV matmul)
  * Scope 3: Output projection, residual, RMSNorm, MLP
- Updated HIDDEN size from 5120 to 8192 (64 heads × 128 dim)
- Renamed qwen3_32b_decode_tilelet.py to qwen3_32b_decode_mixed.py
for clearer TILELET-aware version naming
- Adjusted tiling constants for each scope
1 parent 0d48e70
2 files changed
Lines changed: 647 additions & 312 deletions
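The three scopes listed in the commit message can be illustrated with a minimal NumPy sketch of one single-token decode step. This is not the code in qwen3_32b_decode.py: it ignores tiling and any hardware-specific scoping, and every function name, weight key, and toy size below is hypothetical. It also assumes plain multi-head attention (no grouped KV heads), RoPE applied to Q as well as K inside scope 2, a SwiGLU MLP, and a second residual connection around the MLP, none of which the commit message states.

```python
import numpy as np

# Hypothetical toy sizes; the commit itself uses HIDDEN = 64 heads x 128 dim = 8192.
NUM_HEADS, HEAD_DIM = 4, 8
HIDDEN = NUM_HEADS * HEAD_DIM
INTERMEDIATE = 2 * HIDDEN
EPS = 1e-6


def rms_norm(x, weight):
    """RMSNorm over the last axis."""
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + EPS) * weight


def rope(x, pos):
    """Rotary embedding (rotate-half form) for x of shape (heads, head_dim)."""
    half = x.shape[-1] // 2
    inv_freq = 1.0 / (10000.0 ** (np.arange(half) / half))
    cos, sin = np.cos(pos * inv_freq), np.sin(pos * inv_freq)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)


def scope1_norm_qkv(x, w):
    """Scope 1: input RMSNorm + Q/K/V projection."""
    h = rms_norm(x, w["ln1"])
    q = (h @ w["wq"]).reshape(NUM_HEADS, HEAD_DIM)
    k = (h @ w["wk"]).reshape(NUM_HEADS, HEAD_DIM)
    v = (h @ w["wv"]).reshape(NUM_HEADS, HEAD_DIM)
    return q, k, v


def scope2_attention(q, k, v, k_cache, v_cache, pos):
    """Scope 2: K RoPE + cache update, QK matmul, softmax, SV matmul."""
    q, k = rope(q, pos), rope(k, pos)  # assumption: Q is rotated here too
    k_cache[pos], v_cache[pos] = k, v  # caches have shape (max_seq, heads, head_dim)
    keys, values = k_cache[: pos + 1], v_cache[: pos + 1]
    scores = np.einsum("hd,thd->ht", q, keys) / np.sqrt(HEAD_DIM)
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    return np.einsum("ht,thd->hd", probs, values).reshape(HIDDEN)


def scope3_out_mlp(x, attn, w):
    """Scope 3: output projection, residual, RMSNorm, MLP."""
    h = x + attn @ w["wo"]
    n = rms_norm(h, w["ln2"])
    gate = n @ w["w_gate"]
    act = gate / (1.0 + np.exp(-gate)) * (n @ w["w_up"])  # assumed SwiGLU
    return h + act @ w["w_down"]  # assumed second residual


def decode_step(x, w, k_cache, v_cache, pos):
    """Run the three scopes in order for one token at position pos."""
    q, k, v = scope1_norm_qkv(x, w)
    attn = scope2_attention(q, k, v, k_cache, v_cache, pos)
    return scope3_out_mlp(x, attn, w)


def make_weights(seed=0):
    """Random toy weights, only for exercising the sketch."""
    rng = np.random.default_rng(seed)

    def mat(rows, cols):
        return rng.standard_normal((rows, cols)) / np.sqrt(rows)

    return {
        "ln1": np.ones(HIDDEN), "ln2": np.ones(HIDDEN),
        "wq": mat(HIDDEN, HIDDEN), "wk": mat(HIDDEN, HIDDEN),
        "wv": mat(HIDDEN, HIDDEN), "wo": mat(HIDDEN, HIDDEN),
        "w_gate": mat(HIDDEN, INTERMEDIATE), "w_up": mat(HIDDEN, INTERMEDIATE),
        "w_down": mat(INTERMEDIATE, HIDDEN),
    }
```

Splitting at these boundaries means each scope hands the next only a small set of tensors (Q/K/V after scope 1, the attention output after scope 2), which is presumably what lets the tiling constants be chosen per scope.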