Inference-time LLM structured pruning via probe-based representation-parameter coupling (ICML 2026)
Updated Jun 17, 2026 - Python