
Adjoint/transpose matvec allocates per call (forward path is alloc-free) — hurts the :iterative lstsq loop #12

@ChrisRackauckas-Claude

WHAT: _adjoint_matvec! allocates a fresh length-r scratch w = Vector{...}(undef, r) on every call (src/matvec.jl:152). Confirmed by reading the source. Measured 96-128 B/call vs 0 B for the forward mul!. Likewise, the cached LS solve _structured_apply! (src/lstsq.jl:247-263) allocates the temporaries F.W'*b, F.QL'*b, and F.Msp*Wtc (256 B/solve) even though the struct already carries a length-n cbuf. Every other ldiv! (Woodbury/augmented/QR) measures exactly 0.

WHY IT MATTERS: The adjoint matvec is what the :iterative (LSQR/LSMR) engine drives on every iteration, so the total allocation of an iterative solve grows as O(iterations). The cached LS solve is sold for hot Newton loops. Both break the API's allocation-free contract.

FIX:
(a) Hoist the length-r adjoint scratch into a reusable, size-guarded buffer. For a SelectorMatrix U, the common case, w is just u[1:r] and needs no allocation at all.
(b) Add s-length and r-length scratch fields to the LS struct and mul! into them, mirroring the Woodbury buffer approach.

EFFORT: S (selector matvec) / M (general + LS struct).
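A minimal sketch of the buffer-hoisting idea in FIX (a), not the package's actual code. It assumes a real-valued operator of the form A = D + U*V'. The type LowRankUpdate, its fields, and the function adjoint_matvec! are hypothetical names invented for illustration. The only point is that the length-r scratch is allocated once at construction and then filled in place with mul!.

```julia
using LinearAlgebra

# Hypothetical stand-in for the package's operator: A = Diagonal(d) + U*V'.
# All names here are illustrative assumptions, not the real API.
struct LowRankUpdate{T}
    d::Vector{T}      # diagonal part, length n
    U::Matrix{T}      # n × r
    V::Matrix{T}      # n × r
    wbuf::Vector{T}   # length-r scratch, allocated once at construction
end

# Convenience constructor that sizes the scratch buffer from U.
LowRankUpdate(d, U, V) = LowRankUpdate(d, U, V, similar(d, size(U, 2)))

# y = A' * x = d .* x + V * (U' * x) for real-valued data,
# written so that no temporary vectors are created.
function adjoint_matvec!(y, A::LowRankUpdate, x)
    mul!(A.wbuf, A.U', x)              # w = U' * x, written into the scratch
    y .= A.d .* x                      # diagonal contribution, fused broadcast
    mul!(y, A.V, A.wbuf, true, true)   # y += V * w via 5-argument mul!
    return y
end

# Helper that measures allocations from inside a function, which avoids
# noise from non-const globals at the top level.
measure_allocs(y, A, x) = @allocated adjoint_matvec!(y, A, x)
```

To check the allocation behaviour, call measure_allocs once to compile and then again to measure. The same pattern would apply to FIX (b): give the LS struct its own s-length and r-length buffers and replace expressions such as F.W'*b with mul! into those buffers.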


Priority: medium. Filed from an automated next-steps audit of the QR/lstsq work (see PR #6).
