
ci(benchmark): upgrade Kimi K2.5 to K2.6 #781

Open
carlushuang wants to merge 1 commit into main from carhuang/update_kimi_k2.6

@carlushuang
Contributor

Summary

Replace Kimi-K2.5-MXFP4 with Kimi-K2.6-MXFP4 across all CI configs.

Same architecture (KimiK25ForConditionalGeneration, model_type: kimi_k25), better-trained weights. No model code changes needed.

Changes

| Old | New | HF Path |
| --- | --- | --- |
| Kimi-K2.5-MXFP4 | Kimi-K2.6-MXFP4 | amd/Kimi-K2.6-MXFP4 |

Accuracy baseline from the HF model card: gsm8k = 0.9393 for BF16 and 0.9318 for MXFP4 (99.2% recovery).
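The recovery figure is the MXFP4 score divided by the BF16 score. A minimal sketch of that arithmetic, using only the two gsm8k numbers quoted in this PR (the helper name `recovery` is illustrative, not part of the CI code):

```python
# gsm8k scores quoted in this PR (taken from the HF model card).
BF16_GSM8K = 0.9393
MXFP4_GSM8K = 0.9318


def recovery(quantized: float, reference: float) -> float:
    """Return the quantized score as a fraction of the reference score."""
    if reference <= 0:
        raise ValueError("reference score must be positive")
    return quantized / reference


if __name__ == "__main__":
    # Format as a percentage with one decimal place.
    print(f"recovery = {recovery(MXFP4_GSM8K, BF16_GSM8K):.1%}")
```

Run as a script, this prints `recovery = 99.2%`, which matches the figure above.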

Files updated:

  • .github/benchmark/models.json
  • .github/benchmark/models_accuracy.json
  • .github/benchmark/oot_models_accuracy.json
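The change to these files amounts to swapping one model name for another. The schema of the JSON files is not shown in this PR, so the sketch below works on any JSON tree and uses a hypothetical sample entry (the `models`, `name`, and `path` keys are assumptions for illustration only):

```python
import json

OLD_NAME = "Kimi-K2.5-MXFP4"
NEW_NAME = "Kimi-K2.6-MXFP4"


def rename_model(node, old=OLD_NAME, new=NEW_NAME):
    """Recursively replace `old` with `new` in every string key and value."""
    if isinstance(node, str):
        return node.replace(old, new)
    if isinstance(node, list):
        return [rename_model(item, old, new) for item in node]
    if isinstance(node, dict):
        return {
            rename_model(key, old, new): rename_model(value, old, new)
            for key, value in node.items()
        }
    # Numbers, booleans, and null pass through unchanged.
    return node


# Hypothetical entry; the real layout of models.json may differ.
sample = json.loads(
    '{"models": [{"name": "Kimi-K2.5-MXFP4", "path": "amd/Kimi-K2.5-MXFP4"}]}'
)
updated = rename_model(sample)
print(json.dumps(updated, indent=2))
```

The replacement is case-sensitive, so lowercase identifiers such as the Eagle3 draft model name mentioned in the commit message would need a separate substitution.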

Test plan

  • Architecture verified: same KimiK25ForConditionalGeneration class
  • Nightly accuracy validation (will run after merge)
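The architecture check in the first item could be scripted roughly as follows. This is a sketch that assumes the checkpoint ships a Hugging Face style `config.json` with `architectures` and `model_type` fields; the function name and the sample dictionaries are hypothetical, and only the two expected values come from this PR:

```python
EXPECTED_ARCH = "KimiK25ForConditionalGeneration"
EXPECTED_MODEL_TYPE = "kimi_k25"


def same_architecture(config: dict) -> bool:
    """Check that a parsed config.json names the expected class and model type."""
    architectures = config.get("architectures") or []
    return (
        EXPECTED_ARCH in architectures
        and config.get("model_type") == EXPECTED_MODEL_TYPE
    )


# Hypothetical parsed configs, standing in for json.load(open("config.json")).
matching = {"architectures": [EXPECTED_ARCH], "model_type": EXPECTED_MODEL_TYPE}
different = {"architectures": ["SomeOtherModel"], "model_type": "other"}

print(same_architecture(matching))
print(same_architecture(different))
```

A check like this only confirms that the existing model code path is selected; it says nothing about accuracy, which is what the nightly validation covers.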

Replace Kimi-K2.5-MXFP4 with Kimi-K2.6-MXFP4 across benchmark and accuracy
configs. Also update Eagle3 draft model to kimi-k2.6-eagle3.

Same architecture (KimiK25ForConditionalGeneration), better weights.
Accuracy baseline from HF amd/Kimi-K2.6-MXFP4: gsm8k=0.9393 (BF16),
MXFP4=0.9318 (99.2% recovery). Quantized by AMD via Quark.

Note: K2.6-MXFP4 requires ATOM with PR#744 (MXFP4 shuffle alignment)
and latest aiter with topk_gating.
@carlushuang force-pushed the carhuang/update_kimi_k2.6 branch from 814a70a to ee394bb on May 14, 2026 07:01