Status: canonical
PLIB-214/#3729 reference record, updated 2026-03-16 after landing the first bounded train-class gradient-scaling report in crates/psionic-train/src/mixed_precision.rs.
This document records the current bounded gradient-scaling semantics surface for Psionic.
Run the gradient-scaling harness from the repo root:
scripts/release/check-psionic-gradient-scaling-semantics.sh

psionic-train now exposes the following names (a short usage sketch follows the list):
- GradientScalingMode
- GradientScalingSignal
- GradientScalingDiagnostic
- TrainingGradientScalingPolicy
- GradientScalingDecision
- GradientScalingCaseResult
- GradientScalingSemanticsReport
- builtin_gradient_scaling_semantics_report()
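As a quick orientation, the sketch below shows one way a caller might pull the builtin report. It is a hypothetical usage sketch, not the published API: the module path, the no-argument constructor, the case_results() accessor, and the Debug formatting are all assumptions, since the concrete signatures live in crates/psionic-train/src/mixed_precision.rs.

```rust
// Hypothetical usage sketch only: the real signatures are defined in
// crates/psionic-train/src/mixed_precision.rs and may differ. The module
// path, `case_results()` accessor, and Debug output are assumptions.
use psionic_train::mixed_precision::{
    builtin_gradient_scaling_semantics_report, GradientScalingSemanticsReport,
};

fn main() {
    // Assumed: the builtin report is constructed without arguments and
    // carries one seeded case result per published scenario.
    let report: GradientScalingSemanticsReport = builtin_gradient_scaling_semantics_report();
    for case in report.case_results() {
        println!("{case:?}");
    }
}
```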
Today Psionic has a first-class typed train-class gradient-scaling surface, but it does not claim broad mixed-precision training closure across all backends or precision families.
The bounded seeded surface now makes these seams explicit (a stand-in sketch of this step logic follows the list):
- dynamic loss scaling for the current fp16 train path
- overflow handling that backs off the scale and skips the optimizer step
- underflow handling that grows the scale instead of silently accepting vanishing gradients
- an explicit bf16 no-scaling posture
- typed refusal when the bounded surface lacks fp32 master weights or receives unsupported gradient precisions
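To make the listed behaviors concrete, here is a minimal self-contained sketch of the step decision logic, written against stand-in types rather than the published psionic-train ones. The halving/doubling factors, variant names, and the ordering of the refusal checks are all assumptions for illustration.

```rust
// Stand-in sketch of the bounded step semantics described above. These types
// are NOT the psionic-train definitions; scale factors, variant names, and
// check ordering are illustrative assumptions.

/// Gradient precisions the bounded surface distinguishes (assumed).
#[derive(Debug, Clone, Copy, PartialEq)]
enum GradPrecision {
    Fp16,
    Bf16,
    Unsupported,
}

/// Per-step observation from the unscaled gradients (assumed).
#[derive(Debug, Clone, Copy)]
enum StepSignal {
    Overflow,
    Underflow,
    Stable,
}

/// Outcome handed back to the trainer loop (assumed).
#[derive(Debug, PartialEq)]
enum StepDecision {
    /// Overflow: back the loss scale off and skip this optimizer step.
    BackOffAndSkip { new_scale: f64 },
    /// Underflow: grow the scale instead of accepting vanishing gradients.
    GrowAndStep { new_scale: f64 },
    /// Healthy step at the current scale (1.0 for the bf16 no-scaling posture).
    Step { scale: f64 },
    /// Typed refusal instead of silently proceeding.
    Refuse { reason: &'static str },
}

fn decide_step(
    precision: GradPrecision,
    has_fp32_master_weights: bool,
    scale: f64,
    signal: StepSignal,
) -> StepDecision {
    // Refuse when the bounded surface lacks fp32 master weights.
    if !has_fp32_master_weights {
        return StepDecision::Refuse {
            reason: "fp32 master weights are required",
        };
    }
    // Refuse unsupported gradient precisions rather than guessing.
    if precision == GradPrecision::Unsupported {
        return StepDecision::Refuse {
            reason: "unsupported gradient precision",
        };
    }
    // bf16 posture: no loss scaling, step at scale 1.0.
    if precision == GradPrecision::Bf16 {
        return StepDecision::Step { scale: 1.0 };
    }
    // fp16 dynamic loss scaling: back off on overflow, grow on underflow.
    match signal {
        StepSignal::Overflow => StepDecision::BackOffAndSkip { new_scale: scale / 2.0 },
        StepSignal::Underflow => StepDecision::GrowAndStep { new_scale: scale * 2.0 },
        StepSignal::Stable => StepDecision::Step { scale },
    }
}
```

In the real surface these roles roughly correspond to GradientScalingSignal, GradientScalingDecision, and TrainingGradientScalingPolicy, but the mapping here is only approximate.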
This report prevents two failure modes:
- claiming "mixed-precision training works" because autocast exists while loss scaling remains implicit
- silently masking overflow or underflow behavior inside one trainer loop instead of publishing a reusable contract
The point of this issue is to make train-class mixed-precision step behavior machine-legible so later quantization, distributed data-feed, and export work can build on one explicit contract.