Stage 1: Attention Visualization
Add NF_MSG_ATTENTION_PATTERN packet type (token→token edges per head)
Implement attention matrix sampling in C++ probe
Frontend: render token graphs with edge weights (not neuron scatter)
Target: visualize multi-head attention without full connectivity assumption
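The sampling step for Stage 1 could look roughly like the sketch below: keep only the strongest token→token edges per head and serialize them into a compact packet. The roadmap does not define the wire format, so the numeric value of NF_MSG_ATTENTION_PATTERN, the header layout, and the helper names `sample_edges`, `pack_attention_packet`, and `unpack_attention_packet` are all assumptions made for illustration. It is written in Python for brevity; the actual probe is C++.

```python
import struct

# Hypothetical packet type ID; the roadmap does not give the real value.
NF_MSG_ATTENTION_PATTERN = 0x10

# Assumed layout: msg type (u8), layer (u16), head (u16), edge count (u32),
# followed by one (query u16, key u16, weight f32) record per edge.
HEADER_FMT = "<BHHI"
EDGE_FMT = "<HHf"


def sample_edges(attn, top_k=2):
    """Keep the top_k highest-weight key tokens for each query token of one head."""
    edges = []
    for q, row in enumerate(attn):
        ranked = sorted(range(len(row)), key=lambda k: row[k], reverse=True)[:top_k]
        edges.extend((q, k, row[k]) for k in ranked)
    return edges


def pack_attention_packet(layer, head, edges):
    """Serialize sampled edges for one (layer, head) pair."""
    header = struct.pack(HEADER_FMT, NF_MSG_ATTENTION_PATTERN, layer, head, len(edges))
    body = b"".join(struct.pack(EDGE_FMT, q, k, w) for q, k, w in edges)
    return header + body


def unpack_attention_packet(buf):
    """Inverse of pack_attention_packet; returns (msg, layer, head, edges)."""
    msg, layer, head, count = struct.unpack_from(HEADER_FMT, buf, 0)
    offset = struct.calcsize(HEADER_FMT)
    step = struct.calcsize(EDGE_FMT)
    edges = [struct.unpack_from(EDGE_FMT, buf, offset + i * step) for i in range(count)]
    return msg, layer, head, edges
```

Sending only the top-k edges per query keeps the packet size linear in sequence length instead of quadratic, which is one way to avoid the full-connectivity assumption mentioned in the target.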
Stage 2: Performance Profiling
Measure actual training overhead (currently untested in real training loop)
Add configurable backpressure (drop-oldest vs drop-newest)
Benchmark ring buffer contention under high-frequency hooks
Target: validate the <5% overhead claim against real models
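To make the two backpressure policies from Stage 2 concrete, here is a minimal single-threaded sketch of a bounded queue that supports both. The class name `BoundedPacketQueue` and the policy strings are hypothetical, and the sketch says nothing about the lock-free ring buffer in the C++ probe; it only shows the difference in what gets discarded when the buffer is full.

```python
from collections import deque


class BoundedPacketQueue:
    """Fixed-capacity queue with a configurable overflow policy (illustrative only)."""

    POLICIES = ("drop-oldest", "drop-newest")

    def __init__(self, capacity, policy="drop-oldest"):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy!r}")
        self.capacity = capacity
        self.policy = policy
        self.items = deque()
        self.dropped = 0  # total packets discarded due to overflow

    def push(self, packet):
        """Enqueue a packet; return True if the packet itself was stored."""
        if len(self.items) < self.capacity:
            self.items.append(packet)
            return True
        self.dropped += 1
        if self.policy == "drop-oldest":
            # Evict the stalest packet so the viewer always sees recent state.
            self.items.popleft()
            self.items.append(packet)
            return True
        # drop-newest: keep the backlog intact and reject the incoming packet.
        return False

    def pop(self):
        """Dequeue the oldest stored packet, or None if the queue is empty."""
        return self.items.popleft() if self.items else None
```

Drop-oldest tends to suit live visualization, where stale frames have little value, while drop-newest preserves a contiguous prefix, which may matter more for recording.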
Stage 3: MoE Support
Add expert routing packet (NF_MSG_EXPERT_ROUTING)
Track per-expert utilization and load balancing
Frontend: expert utilization heatmap per layer
Target: debug MoE expert collapse during training
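One possible way to summarize per-expert utilization for Stage 3 is sketched below: compute the fraction of tokens routed to each expert, then reduce that distribution to a single balance score using normalized entropy. The function names and the choice of entropy as the balance metric are assumptions for illustration; the roadmap does not specify how load balancing will be measured.

```python
import math
from collections import Counter


def expert_utilization(assignments, num_experts):
    """Fraction of tokens routed to each expert, given one expert index per token."""
    total = len(assignments)
    if total == 0:
        return [0.0] * num_experts
    counts = Counter(assignments)
    return [counts.get(e, 0) / total for e in range(num_experts)]


def balance_score(utilization):
    """Normalized entropy in [0, 1]: 1.0 is perfectly even, 0.0 is full collapse."""
    if len(utilization) < 2:
        return 1.0  # a single expert is trivially balanced
    entropy = -sum(p * math.log(p) for p in utilization if p > 0)
    return abs(entropy / math.log(len(utilization)))
```

A score that trends toward 0 over training steps for a given layer is one signal of expert collapse; the per-expert fractions are what a heatmap cell per (layer, expert) would display.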
Stage 4: Production Hardening
Handle packet version mismatches (currently there is no graceful fallback)
Add reconnection logic with state sync (NF_OP_STATE_SNAPSHOT exists but unused)
Clean shutdown on Python interpreter exit (currently commented out)
Add CI/CD for build matrix (Python 3.9-3.12, CUDA/CPU variants)
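A graceful fallback for version mismatches might follow the common major/minor convention, sketched below. The header layout (two leading bytes for major and minor version), the `PROTOCOL_VERSION` value, and the three outcomes are all hypothetical; the issue does not describe how versions are encoded in the existing protocol.

```python
import struct

# Hypothetical version understood by this client.
PROTOCOL_VERSION = (1, 2)


def check_version(header_bytes, supported=PROTOCOL_VERSION):
    """Classify an incoming packet header as 'accept', 'degrade', or 'reject'.

    Assumes the first two bytes are (major, minor) as unsigned 8-bit integers.
    """
    if len(header_bytes) < 2:
        return "reject"  # too short to carry a version at all
    major, minor = struct.unpack_from("<BB", header_bytes, 0)
    if major != supported[0]:
        # Incompatible wire format: refuse instead of misparsing.
        return "reject"
    if minor > supported[1]:
        # Newer sender: parse known fields and skip unknown packet types.
        return "degrade"
    return "accept"
```

Under this scheme a client would log and skip unknown packet types in the "degrade" case, and close the connection with an explicit error in the "reject" case.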
Stage 5: Multi-Client & Recording
Server-side packet recording to disk (binary log replay)
Multi-client broadcast validation (currently untested)
Playback mode: load recorded session without training loop
Target: offline analysis and debugging
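Recording and playback for Stage 5 can be sketched as a length-prefixed binary log: each record stores a timestamp and payload size followed by the raw packet bytes. The record layout and the names `record_packet` and `replay` are assumptions for illustration, not the project's actual log format.

```python
import io
import struct

# Assumed record header: timestamp in seconds (f64) and payload length (u32).
RECORD_HEADER = struct.Struct("<dI")


def record_packet(stream, timestamp, payload):
    """Append one packet to a binary log opened for writing."""
    stream.write(RECORD_HEADER.pack(timestamp, len(payload)))
    stream.write(payload)


def replay(stream):
    """Yield (timestamp, payload) pairs from a binary log, stopping at EOF.

    A truncated trailing record (for example after a crash) is ignored.
    """
    while True:
        header = stream.read(RECORD_HEADER.size)
        if len(header) < RECORD_HEADER.size:
            return
        timestamp, length = RECORD_HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) < length:
            return
        yield timestamp, payload


if __name__ == "__main__":
    log = io.BytesIO()
    record_packet(log, 0.5, b"abc")
    record_packet(log, 1.25, b"")
    log.seek(0)
    for ts, data in replay(log):
        print(ts, data)
```

Because replay only needs a readable byte stream, the same function works on a file opened with `open(path, "rb")`, which is what a playback mode without a training loop would use.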
Stage 6: Documentation & Examples
Real PyTorch hook example (not just simulator)
Transformer-specific integration guide (Hugging Face, NanoGPT)
Video walkthrough of debugging a training run
Protocol specification as standalone doc