forked from antirez/ds4
Pull requests: TrevorS/ds4
Pull requests list
mtp: combined-forward speculative decode beats plain on GB10 (+2.4 t/s) (stacked on #11)
#12
opened May 24, 2026 by
TrevorS
Owner
cuda: DGX Spark / GB10 backend support — HBM-resident model
#11
opened May 24, 2026 by
TrevorS
Owner
cli: default mtp_draft_tokens=2; cuda: post-stack cleanup (stacked on #9)
#10
opened May 24, 2026 by
TrevorS
Owner
cuda: F16 share-warp kernel for n_tok=2 combined-forward verifier (stacked on #8)
#9
opened May 24, 2026 by
TrevorS
Owner
cuda: make share-warp Q8 kernel bit-equal at any block count (stacked on #7)
#8
opened May 24, 2026 by
TrevorS
Owner
cuda: replace K4 cuBLAS-cache gate with strict-mode gate (stacked on #6)
#7
opened May 24, 2026 by
TrevorS
Owner
mtp: combined-forward default + Option-B strict fallback (stacked on #5)
#6
opened May 24, 2026 by
TrevorS
Owner
cuda: DS4_CUDA_STRICT_BATCHED — bit-equal batched-N infrastructure (stacked on #4)
#5
opened May 24, 2026 by
TrevorS
Owner
mtp: ACCEPT_REPORT instrumentation + SPEC_DISABLE refactor (stacked on #3)
#4
opened May 23, 2026 by
TrevorS
Owner
cuda: revive 2 dropped kernels with FMA-contraction fixes (stacked on #2)
#3
opened May 23, 2026 by
TrevorS
Owner
cuda: small-N batched kernel polish (stacked on #1)
#2
opened May 23, 2026 by
TrevorS
Owner