feat(M88): Framework-Level Inference Hooks #189

Merged
hlin99 merged 1 commit into main from feat/m88-inference-hooks on Apr 6, 2026

hlin99 commented on Apr 6, 2026

Summary

Adds framework-level inference hooks that capture intermediate states during inference. Comparing those captures between a baseline run and a target run enables root-cause analysis of PD divergence.

Changes

  • inference_hooks.py: Core module with:
    • InferenceHook protocol: on_prefill, on_kv_transfer, on_decode_step
    • HookCapture, StageComparison, TraceResult dataclasses
    • MockInferenceHook for testing with configurable noise scale
    • compare_captures(): field-by-field comparison (max/mean abs diff, cosine similarity)
    • run_trace(): orchestrates hook-based comparison across pipeline stages
    • format_trace(): rich terminal output with per-stage comparison table
  • CLI: xpyd-acc trace subcommand with --baseline, --target, --prompt, --hooks, --mock, --json, --threshold
  • 34 tests covering hook protocol, captures, comparison, trace, formatting, JSON export, CLI integration
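For readers without the diff at hand, here is a minimal sketch of how a hook protocol with these three hook points and a noise-injecting mock could fit together. Only the names InferenceHook, HookCapture, MockInferenceHook and the three method names come from this PR. The argument lists, the HookCapture fields, the `seed` parameter, and the use of plain float lists instead of tensors are assumptions made for illustration, not the merged code.

```python
import random
from dataclasses import dataclass, field
from typing import List, Protocol, runtime_checkable


@dataclass
class HookCapture:
    """One captured intermediate state (field names are illustrative)."""
    stage: str
    values: List[float]


@runtime_checkable
class InferenceHook(Protocol):
    """Hook points named in the PR; the argument lists are assumptions."""

    def on_prefill(self, hidden: List[float]) -> None: ...

    def on_kv_transfer(self, kv: List[float]) -> None: ...

    def on_decode_step(self, step: int, logits: List[float]) -> None: ...


@dataclass
class MockInferenceHook:
    """Records every stage, adding seeded Gaussian noise scaled by noise_scale."""
    noise_scale: float = 0.0
    seed: int = 0  # hypothetical parameter, used to make the noise reproducible
    captures: List[HookCapture] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._rng = random.Random(self.seed)

    def _record(self, stage: str, values: List[float]) -> None:
        noisy = [v + self._rng.gauss(0.0, 1.0) * self.noise_scale for v in values]
        self.captures.append(HookCapture(stage, noisy))

    def on_prefill(self, hidden: List[float]) -> None:
        self._record("prefill", hidden)

    def on_kv_transfer(self, kv: List[float]) -> None:
        self._record("kv_transfer", kv)

    def on_decode_step(self, step: int, logits: List[float]) -> None:
        self._record(f"decode_step_{step}", logits)


baseline = MockInferenceHook(noise_scale=0.0)
target = MockInferenceHook(noise_scale=0.01, seed=7)
for hook in (baseline, target):
    hook.on_prefill([1.0, 2.0, 3.0])
    hook.on_kv_transfer([0.5, 0.25])
    hook.on_decode_step(0, [0.1, 0.9])
```

Because the protocol is structural, the mock never inherits from InferenceHook; `isinstance(target, InferenceHook)` succeeds as long as the three methods exist. A fixed seed keeps the injected noise identical across runs, which is one way a test could stay deterministic.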

Closes #188

- InferenceHook protocol: on_prefill, on_kv_transfer, on_decode_step
- HookCapture, StageComparison, TraceResult dataclasses
- MockInferenceHook for testing with configurable noise
- compare_captures() for field-by-field comparison (max/mean diff, cosine sim)
- run_trace() orchestrates hook-based comparison across stages
- format_trace() for rich terminal output with per-stage table
- xpyd-acc trace CLI subcommand with --mock, --json, --hooks, --threshold
- 34 tests covering protocol, captures, comparison, trace, formatting, CLI
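The shape of the trace subcommand can be sketched with argparse. The flag names are the ones listed in this PR; everything else here (value types, the absence of defaults, help strings, and whether `--json` is a switch or takes a path) is an assumption, and the real xpyd-acc parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flag names listed in the PR."""
    parser = argparse.ArgumentParser(prog="xpyd-acc")
    sub = parser.add_subparsers(dest="command", required=True)

    trace = sub.add_parser("trace", help="compare intermediate states per stage")
    trace.add_argument("--baseline", help="baseline endpoint (assumed: a string)")
    trace.add_argument("--target", help="target endpoint (assumed: a string)")
    trace.add_argument("--prompt", help="prompt to trace (assumed: a string)")
    trace.add_argument("--hooks", help="hook selection (assumed: a string)")
    # Assumed to be boolean switches; the PR does not say.
    trace.add_argument("--mock", action="store_true", help="use the mock hook")
    trace.add_argument("--json", action="store_true", help="emit JSON output")
    # Assumed to be a float tolerance with no default.
    trace.add_argument("--threshold", type=float, default=None)
    return parser


args = build_parser().parse_args(["trace", "--mock", "--threshold", "0.01"])
```

With this sketch, `args.mock` is `True`, `args.json` is `False`, and `args.threshold` is `0.01`.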

Closes #188

hlin99-Review-Bot left a comment

Approved (hlin99-Review-Bot)

Idea Value: High — framework-level inference hooks fill a critical gap for root-cause analysis of PD divergence. The hook protocol + mock pattern enables testing without live endpoints.

Code Quality: Clean.

  • Well-defined protocol (InferenceHook) with clear hook points
  • Solid dataclass hierarchy with proper serialization
  • compare_captures() handles edge cases (shape mismatch, None fields)
  • 34 tests with good coverage (protocol, captures, comparison, trace, formatting, JSON, CLI)
  • CI all green (lint + tests on 3.10/3.11/3.12)
  • docs/iterations/current.md updated
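To make the edge cases mentioned in this review concrete, the following sketch compares one captured field and reports a status instead of raising when a side is missing or the lengths differ. The FieldDiff type, the `compare_field` helper, the status strings, and the flat float lists are all hypothetical; the PR's compare_captures() may report these cases differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldDiff:
    """Result for one field (hypothetical type, not from the PR)."""
    status: str  # "ok", "missing", or "shape_mismatch"
    max_abs_diff: Optional[float] = None
    mean_abs_diff: Optional[float] = None


def compare_field(a: Optional[List[float]], b: Optional[List[float]]) -> FieldDiff:
    # A field captured on only one side cannot be compared numerically.
    if a is None or b is None:
        return FieldDiff("missing")
    # Different lengths stand in for a tensor shape mismatch here.
    if len(a) != len(b):
        return FieldDiff("shape_mismatch")
    # Two empty fields are trivially identical.
    if not a:
        return FieldDiff("ok", 0.0, 0.0)
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return FieldDiff("ok", max(diffs), sum(diffs) / len(diffs))


same_shape = compare_field([1.0, 2.0], [1.5, 2.0])
missing = compare_field(None, [1.0])
mismatch = compare_field([1.0], [1.0, 2.0])
```

Returning a status keeps one bad field from aborting the whole trace, so the remaining stages can still be compared.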

LGTM.

hlin99-Review-BotX left a comment

Approved (hlin99-Review-BotX)

Idea Value: High — inference hooks are essential for pinpointing where PD divergence originates. The protocol-based design with clear hook points (prefill/kv_transfer/decode_step) is the right abstraction.

Code Quality: Clean.

  • InferenceHook protocol is well-defined with runtime_checkable
  • compare_captures() handles shape mismatches and None fields correctly
  • _cosine_sim properly handles zero-norm edge case
  • run_trace() cleanly orchestrates the comparison pipeline
  • Mock hook with configurable noise enables deterministic testing
  • 34 tests, CI green on 3.10/3.11/3.12
  • docs/iterations/current.md updated
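The zero-norm case matters because cosine similarity divides by the product of the two norms. A minimal sketch of one way to guard it is shown below. The return values chosen for zero vectors (1.0 when both are zero, 0.0 when only one is) are an assumed convention; this review does not say what the PR's `_cosine_sim` returns.

```python
import math
from typing import Sequence


def cosine_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors with a zero-norm guard."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumed convention: two all-zero vectors count as identical,
        # a zero vector against a non-zero one counts as unrelated.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


both_zero = cosine_sim([0.0, 0.0], [0.0, 0.0])
one_zero = cosine_sim([0.0, 0.0], [1.0, 0.0])
orthogonal = cosine_sim([1.0, 0.0], [0.0, 1.0])
```

Without the guard, an all-zero capture would raise ZeroDivisionError instead of producing a comparable score.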

Second approval — should auto-merge. LGTM.

@hlin99 hlin99 merged commit 0b11aa6 into main Apr 6, 2026
5 checks passed