Add lstsq: minimum-norm least squares (A⁺b) for rank-deficient / inconsistent systems#5
Add lstsq: minimum-norm least squares (A⁺b) for rank-deficient / inconsistent systems#5ChrisRackauckas-Claude wants to merge 3 commits into
Conversation
…nsistent systems `lstsq(A, b)` returns the Moore–Penrose minimum-norm least-squares solution, correct for rank-deficient or inconsistent A where `\`/`factorize`/`qr` throw SingularException. A separate, opt-in entry point — the direct solves stay exact and keep throwing. Why not the bordered/Woodbury path: solving [S U; V -I][x;y]=[b;0] in least squares minimizes a different objective than ‖Ax−b‖ (measured residual 3–65× too large on rank-deficient inconsistent systems), and there is no Woodbury formula for the pseudoinverse of a sum. So this works on A directly. Two engines: - :dense (default) — LAPACK gelsy (complete-orthogonal decomposition) of the densified A. Exact (matches pinv(Matrix(A))*b to machine precision), O(n³)/O(n²). A = S+UV is always fully dense, so a direct solve cannot exploit the sparsity. - :iterative — LSQR (default) / LSMR driven by A's structured matvec and adjoint, never forming the dense A; for large n. Converges to the same min-norm solution from x0=0. Lives in an IterativeSolvers extension (weakdep). Also adds a structured adjoint/transpose matvec to core (Aᴴu = Sᴴu + Vᴴ(Uᴴu)) — a real fix, since A'*u previously fell back to a ~1000× slower generic getindex path. This lets the iterative engine drive A directly (no LinearMaps/LinearOperators dependency). Engine choice: IterativeSolvers, not Krylov/LinearSolve's KrylovJL_LSMR. Measured: with a correct adjoint, IterativeSolvers lsqr/lsmr reach ~1e-12 min-norm accuracy while Krylov lsmr/lsqr halt early at ~1e-6 (and tightening tolerances does not help). Both converge — Krylov is not "broken" — but IterativeSolvers is ~6 orders tighter, which matters here. Adversarial review found and fixed: (1) lsqr's `isconverged` is unreliable (reports success at the iteration cap with a wrong answer) — now flag on budget exhaustion and report the true LS residual; (2) negative λ silently treated as |λ| by the dense path — now rejected; (3) Int/Rational slipped the eltype guard (checked the float-promoted type) — now checks the raw promoted type. Regression tests for each. Stacked on #4 (the QR PR): test_lstsq.jl asserts qr(A) throws SingularException while lstsq succeeds, so it depends on the QR entry point. 390/390 tests pass on Julia 1.10, 1.11, and 1.12. Runic-formatted, typos clean. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…:auto the default Adds `alg=:structured` and makes `alg=:auto` the default for `lstsq`. When S is nonsingular, A = S(I + ZV) with Z = S⁻¹U, and the entire rank deficiency collapses into the small r×r capacitance C = I + V Z (nullity(A) = nullity(C), since det A = det S · det C). So A⁺b is assembled from ONE sparse factorization of S (PureKLU) + r solves for Z + small dense SVD/QR on r×r and n×s (s ≤ 2r) blocks — A is NEVER formed or factored densely. O(factor(S) + n·r²); measured 191× faster than the dense COD at n=2000, r=6. Handles consistent AND inconsistent b (a rank-k projection of b onto range(A) via the left-null basis S⁻ᴴVᴴnull(Cᴴ)). :auto uses :structured when S is nonsingular & well-conditioned, else the exact :dense COD. The structured route inverts S, so it is guarded two ways and falls back on failure: (1) klu throws on exactly-singular S; (2) near-singular S is caught by a κ̂(S)=‖S‖₁‖Z‖/‖U‖ estimate (klu does NOT throw there — it silently returns garbage Z, verified) plus a post-hoc least-squares-optimality check. `alg=:structured` (forced) errors rather than return a wrong answer; `alg=:auto` falls back to :dense. Every formula was derived and verified against the pinv(Matrix(A))*b oracle before implementation (consistent/inconsistent × rank deficiency k=1,2,3 × dense-U/SelectorMatrix-U × complex): relerr ~1e-15, min-norm + LS-optimality + null-space-free certificates. The inconsistent-case projection is essential — the naive M⁺S⁻¹b shortcut is 23% wrong on inconsistent b (the LS objective ‖Ax−b‖=‖S(Mx−S⁻¹b)‖ is S-weighted). 581/581 tests pass on Julia 1.10, 1.11, and 1.12. Runic-formatted, typos clean. No new deps (reuses the existing PureKLU dependency for the S factorization). Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
|
Major update (699afb6): added a structure-exploiting DIRECT engine and made it the default.
Guarded two ways (klu throws on exactly-singular Every formula was derived and verified against the Note on the singular- Will refresh the PR description once that lands. |
…dles the BVP case When S is singular because its boundary rows are zeroed and supplied by a SelectorMatrix U (the `replace=true` / almost-banded case), the structured engine previously fell back to the dense COD. Add a sparse "peel": A = S + UV = (S + U Uᴴ) + U(V − Uᴴ), and for a selector U the term U Uᴴ is just r diagonal entries, so S̃ = S + U Uᴴ stays SPARSE and is typically nonsingular (the zeroed rows get a unit diagonal back), with the rank unchanged. The capacitance method then applies to (S̃, U, V − Uᴴ) — a structure-exploiting DIRECT solve, no densifying. Verified against pinv(Matrix(A))*b: ~2-3e-15 for A nonsingular AND rank-deficient, consistent AND inconsistent b. A general singular S whose null space is NOT coordinate-aligned has a dense de-singularizing correction (no sparse peel exists), so it still falls back to the exact dense COD — a genuine mathematical limit, caught by `_peel_selector` returning nothing. 617/617 tests pass on Julia 1.10, 1.11, and 1.12. Runic-formatted, typos clean. Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
|
Follow-up (4abc4fc): the structured direct engine now handles the singular- When A general singular So the structure-exploiting direct method now covers: |
Add
lstsq: minimum-norm least squares (A⁺b) for rank-deficient / inconsistent systems\,factorize/lu, andqrsolve nonsingular systems and throwSingularExceptionon a singularA. This addslstsq(A, b)— the minimum-norm least-squares solutionx = A⁺b(the Moore–Penrose solution: of allxminimizing‖Ax−b‖₂, the one of smallest‖x‖₂) — correct for rank-deficient and inconsistent systems. It is a separate, opt-in entry point; the direct solves are unchanged and keep throwing.Why not the bordered/Woodbury machinery
The prior QR work established (and #4's discussion shows) that solving the bordered system
[S U; V -I][x;y]=[b;0]in least squares minimizes a different objective than‖Ax−b‖— measured residual 3–65× too large on rank-deficient inconsistent systems — and there's no Woodbury formula for the pseudoinverse of a sum. Solstsqworks onAdirectly.Two engines
:dense(default)gelsy(complete-orthogonal decomposition) of the densifiedApinv(Matrix(A))*bto ~1e-15O(n³)/O(n²)memn~ few thousand:iterativeA's structured matvec + adjoint, never forming denseAatol/btol(~1e-12); fragile if ill-conditionedO(nnz(S)+nr)nA = S + U·Vis always fully dense (the rank-router product fills every entry), so a direct solve genuinely can't exploit the sparsity — the iterative path is the structure-exploiting one for largen.Structured adjoint matvec (a real fix, beyond LS)
Added
mul!(y, A', u)/Aᴴ/Aᵀto core computingAᴴu = Sᴴu + Vᴴ(Uᴴu). PreviouslyA'*ufell back to LinearAlgebra's genericgetindexpath — ~1000× slower (it materializes columns ofA). This is what lets the iterative engine driveAdirectly, with noLinearMaps/LinearOperatorsdependency.Engine choice: IterativeSolvers, not Krylov /
KrylovJL_LSMRI measured this directly rather than route through LinearSolve's
KrylovJL_*(which would need no new dep). With a correct adjoint, both libraries converge — but IterativeSolverslsqr/lsmrreach ~1e-12 min-norm accuracy while Krylovlsmr/lsqrhalt early at ~1e-6 (tighteningatol/rtoldoesn't help — Krylov stops on a different internal criterion). So Krylov is not broken (an internal review draft had claimed a catastrophic 1.92 error — that was an adjoint-wrapper artifact, corrected), but IterativeSolvers is ~6 orders tighter, which is the right call for a least-squares primitive. IterativeSolvers is a light weakdep behind an extension; the LinearSolve extension is untouched.Adversarial review found and fixed
lsqrsilent wrong answers — IterativeSolvers'isconvergedis unreliable forlsqr(reports success even at the iteration cap with a garbage result; reproduced err=6.4 at maxiter=3). Now flag on budget exhaustion (iters ≥ maxiters, which cleanly separates converged/not for both solvers) and report the true LS residual‖Aᴴ(Ax−b)‖.λsilently treated as|λ|by the dense path (rejected by iterative) → both now validateλ ≥ 0.Int/Rationalslipped through because the check ran on thefloat-promoted type → now checks the rawpromote_type.Regression tests added for each. The review independently confirmed correctness across higher rank deficiency (n−3/n−4),
r==0, SelectorMatrix-U, complex (adjoint ≠ transpose), mixed precision, Tikhonov, and the 5-arg adjointmul!— all matching thepinvoracle.Tests / CI
test/test_lstsq.jl(~90 assertions): adjoint/transpose matvec; dense & iterative min-norm vspinv(consistent + inconsistent, LS-optimality + null-space-free certificates); SelectorMatrix-U; full-rank reduces to\; complex; Tikhonov; the "lstsqsucceeds where\/factorize/qrthrow" separation; eltype/arg/λ/convergence-warning guards.IterativeSolvers(weakdep, extension-gated — core load unchanged).🤖 Generated with Claude Code
Co-Authored-By: Chris Rackauckas accounts@chrisrackauckas.com