Store code fix for large macrotile #37

Open
b-shi wants to merge 1 commit into subtile_mx from subtile_mx_storefix2

@b-shi b-shi commented Apr 24, 2026

Owner

Summary

  • Fix dropped stores in the beta != 0 path for 16-bit subtile paired-store kernels when numElementsPerBatch is not aligned to MIWaveTile[0]
  • When VGPR pressure (from C-load staging registers) reduces the batch size, batch boundaries can split sba=0/sba=1 element pairs. An sba=1 element at the start of a new batch then has no partner in that batch, and it silently emitted no store. This change adds an sba=1 orphan scalar store path that mirrors the existing sba=0 orphan handling.
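
The failure mode described above can be illustrated with a small, simplified model. This is only a sketch, not the kernel writer's code: `plan_stores`, `stored_elements`, and the `handle_sba1_orphan` flag are hypothetical names, and the pairing rule (adjacent even/odd element indices form an sba=0/sba=1 pair) is an assumption made for illustration. The real pairing depends on the MIWaveTile[0] layout.

```python
def plan_stores(num_elements, batch_size, handle_sba1_orphan):
    """Return the stores emitted batch by batch as (kind, element_indices).

    Simplified model (assumption): element i has sba = i % 2, and
    elements (2k, 2k + 1) form an sba=0/sba=1 pair.
    """
    stores = []
    for start in range(0, num_elements, batch_size):
        batch = range(start, min(start + batch_size, num_elements))
        for elem in batch:
            if elem % 2 == 0:
                # sba=0: pair with the next element if it is in this batch,
                # otherwise fall back to the sba=0 orphan scalar store.
                partner = elem + 1
                if partner in batch:
                    stores.append(("paired", (elem, partner)))
                else:
                    stores.append(("scalar", (elem,)))
            else:
                # sba=1: normally covered by its partner's paired store.
                if elem - 1 in batch:
                    continue
                # Orphan sba=1 at the start of a batch. Without the extra
                # path, nothing is emitted and the element is dropped.
                if handle_sba1_orphan:
                    stores.append(("scalar", (elem,)))
    return stores


def stored_elements(stores):
    """Flatten a store plan into the sorted list of element indices written."""
    return sorted(e for _, elems in stores for e in elems)


# Batch size 3 is not a multiple of the pair width, so element 3 (sba=1)
# starts a batch while its partner, element 2, sits in the previous batch.
before_fix = stored_elements(plan_stores(8, 3, handle_sba1_orphan=False))
after_fix = stored_elements(plan_stores(8, 3, handle_sba1_orphan=True))
print("without sba=1 orphan path:", before_fix)
print("with sba=1 orphan path:   ", after_fix)
```

In this model an aligned batch size (for example 4) never produces an sba=1 orphan, which matches the description that the problem only appears when the batch size is reduced to an unaligned value.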

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>