yifu-ding

Pinned

  1. Awesome-Edge-LLMs Public

    This is a repository accompanying the survey Edge AI Meets LLM (coming soon), containing a comprehensive list of papers, codebases, toolchains, and open-source frameworks. It is intended to serve a…

    15 2

  2. BGEMM-CUDA Public

    BGEMM-CUDA is a CUDA-based low-bit GEMM kernel library for efficient neural network inference. It implements optimized binary and ternary matrix multiplication primitives, including binary-weight a…

    Cuda 20 2

  3. DPTS Public

    Official implementation of Dynamic Parallel Tree Search for accelerating LLM reasoning with test-time parallel search.

    Python 4 2

  4. MoE-Slimming Public

    Official ICML 2026 Spotlight implementation for structural MoE compression, including attribution-guided channel scoring, coverage-maximized pruning, compact checkpoint construction, and fine-tunin…

    Python

  5. MP-Sparse-Attn Public

    MP-Sparse-Attn provides Triton kernels for Diagonal-Tiled Mixed-Precision Attention, targeting efficient low-bit MXFP inference for Transformer models. It combines tile-level mixed-precision comput…

    Python 2