WHAT: _adjoint_matvec! allocates a fresh length-r scratch vector, w = Vector{...}(undef, r), on every call (src/matvec.jl:152). Confirmed by reading the source. Measured 96-128 B/call vs 0 B for the forward mul!. Likewise, the cached LS solve _structured_apply! (src/lstsq.jl:247-263) allocates the temporaries F.W'*b, F.QL'*b, and F.Msp*Wtc (256 B/solve) despite the struct carrying a length-n cbuf; every other ldiv! (Woodbury/augmented/QR) measures exactly 0 B.

WHY IT MATTERS: The adjoint matvec is what the :iterative (LSQR/LSMR) engine drives every iteration, so an iterative solve performs O(iterations) allocations; the cached LS solve is sold for hot Newton loops. Both break the API's allocation-free contract.

FIX:
(a) Hoist the length-r adjoint scratch into a reusable, size-guarded buffer. For a SelectorMatrix U (the common case), w is just u[1:r] and needs no allocation at all.
(b) Add s-length and r-length scratch fields to the LS struct and mul! into them, mirroring the Woodbury buffer approach.

EFFORT: S (selector matvec) / M (general + LS struct).
Priority: medium. Filed from an automated next-steps audit of the QR/lstsq work (see PR #6).
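
A minimal sketch of what FIX (a) could look like, not the package's actual code. It assumes a stand-in operator A = D + U*V' with a diagonal D; the names LowRankOp, adjoint_matvec!, and adjoint_matvec_selector! are hypothetical, and the selector variant assumes U = [I; 0] so that U'*x reduces to x[1:r]. The scratch vector is allocated once at construction and reused through mul!.

```julia
using LinearAlgebra

# Hypothetical stand-in for the package's operator: A = D + U*V', rank-r U and V.
struct LowRankOp{T}
    d::Vector{T}   # diagonal of D
    U::Matrix{T}   # n × r
    V::Matrix{T}   # n × r
    w::Vector{T}   # length-r scratch, allocated once and reused on every call
end

# Convenience constructor (hypothetical) that sizes the scratch from U.
LowRankOp(d, U, V) = LowRankOp(d, U, V, Vector{eltype(U)}(undef, size(U, 2)))

# y = A' * x = conj(d) .* x + V * (U' * x), reusing the cached scratch A.w.
function adjoint_matvec!(y::AbstractVector, A::LowRankOp, x::AbstractVector)
    # Size guard: the cached buffer must match the rank of U.
    length(A.w) == size(A.U, 2) || throw(DimensionMismatch("scratch length != r"))
    mul!(A.w, A.U', x)             # w = U' * x, written in place
    y .= conj.(A.d) .* x           # diagonal part, fused broadcast
    mul!(y, A.V, A.w, true, true)  # y += V * w via 5-argument mul!
    return y
end

# Selector special case (assumed U = [I; 0]): U' * x is x[1:r], so a view
# replaces the scratch entirely.
function adjoint_matvec_selector!(y::AbstractVector, d::AbstractVector,
                                  V::AbstractMatrix, x::AbstractVector)
    r = size(V, 2)
    y .= conj.(d) .* x
    mul!(y, V, view(x, 1:r), true, true)  # y += V * x[1:r] without copying
    return y
end
```

Wrapping a warmed-up call in @allocated is one way to check whether the per-call allocation is gone on a given Julia version. The same pattern (scratch fields on the struct, mul! into them) is what FIX (b) proposes for the LS struct.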