fix: wait per-layer on drafter KV pool during cpu cache loadback#6

Open
LorrinWWW wants to merge 4 commits into main from jue/fix-spec-indices-on-new-main
@LorrinWWW
Contributor

Summary

Drafter attention was reading uninitialized KV from in-flight CPU cache loads. The drafter pool was missing `register_layer_transfer_counter`, so `get_key_buffer` never waited per layer before returning a buffer.

This PR also moves the draft-index H2D copy ahead of the `load_stream` `start_event`, so `load_stream` can't consume indices that are not ready yet.
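To illustrate the per-layer wait, here is a minimal CPU-only sketch that uses threads in place of CUDA streams. Only the method names `register_layer_transfer_counter` and `get_key_buffer` come from this PR; `LayerTransferCounter`, `ToyKVPool`, `load_back`, and `run` are hypothetical stand-ins and do not reflect the real pool implementation.

```python
import threading
import time


class LayerTransferCounter:
    """Hypothetical stand-in: one event per layer, set once that layer is loaded."""

    def __init__(self, num_layers):
        self._done = [threading.Event() for _ in range(num_layers)]

    def mark_loaded(self, layer_id):
        self._done[layer_id].set()

    def wait_until(self, layer_id):
        self._done[layer_id].wait()


class ToyKVPool:
    """Hypothetical pool: get_key_buffer waits only if a counter is registered."""

    def __init__(self, num_layers):
        self.k_buffer = [None] * num_layers  # None models uninitialized KV
        self.layer_transfer_counter = None

    def register_layer_transfer_counter(self, counter):
        self.layer_transfer_counter = counter

    def get_key_buffer(self, layer_id):
        if self.layer_transfer_counter is not None:
            # Block until the loadback for this layer has finished.
            self.layer_transfer_counter.wait_until(layer_id)
        return self.k_buffer[layer_id]


def load_back(pool, counter, num_layers, delay=0.01):
    """Simulate an asynchronous layer-by-layer load from the CPU cache."""
    for layer_id in range(num_layers):
        time.sleep(delay)
        pool.k_buffer[layer_id] = f"kv-{layer_id}"
        counter.mark_loaded(layer_id)


def run(register, num_layers=3):
    pool = ToyKVPool(num_layers)
    counter = LayerTransferCounter(num_layers)
    if register:
        pool.register_layer_transfer_counter(counter)
    loader = threading.Thread(target=load_back, args=(pool, counter, num_layers))
    loader.start()
    # The reader plays the role of drafter attention touching each layer in order.
    seen = [pool.get_key_buffer(i) for i in range(num_layers)]
    loader.join()
    return seen
```

With the counter registered, `run(True)` always returns fully loaded buffers. Without it, `run(False)` races the loader and may return `None` entries, which is the toy analogue of the drafter reading uninitialized KV.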

Test Plan

@LorrinWWW LorrinWWW requested a review from a team as a code owner May 6, 2026 17:43
@LorrinWWW LorrinWWW requested a review from XucSh May 7, 2026 14:33
@LorrinWWW LorrinWWW self-assigned this May 8, 2026
@LorrinWWW LorrinWWW marked this pull request as draft May 8, 2026 23:19
@LorrinWWW LorrinWWW changed the title fix: wait per-layer on drafter KV pool during cpu cache loadback [WIP] fix: wait per-layer on drafter KV pool during cpu cache loadback May 8, 2026
@LorrinWWW LorrinWWW marked this pull request as ready for review May 9, 2026 20:18
@LorrinWWW LorrinWWW changed the title [WIP] fix: wait per-layer on drafter KV pool during cpu cache loadback fix: wait per-layer on drafter KV pool during cpu cache loadback May 9, 2026