MILC interface: Exact current by leonhostetler · Pull Request #1633 · lattice/quda

leonhostetler · 2026-05-26T18:03:56Z

This is the second and final PR in the series to enable exact current calculations for HVP in the MILC/QUDA interface. The previous PR was #1590 .

Changes:

Adds a new MILC interface function, qudaExactCurrent, in lib/milc_interface.cpp. This computes the exact low-mode contribution to the current densities needed for HVP current calculations.
Implements eigenvalue shifting in the MILC interface for preserved deflation spaces. Previously, when using a preserved deflation space across multiple solves, the eigenvalues were recomputed each time the mass changed. With this update, computeEvals is called only once, during the first solve. The resulting zero-mass eigenvalues are cached and shifted as needed for subsequent masses.

This avoids many unnecessary mat-vecs and greatly reduces output volume: eigenvalues are now printed only once instead of once per solve, which previously could produce O(10^6) lines of eigenvalue output in a typical production run.
Makes lib/eig_block_trlm.cpp and lib/eig_trlm.cpp slightly more verbose by printing progress notifications every 10th or 100th Lanczos step. On production-size lattices, eigensolves can otherwise run for more than an hour without any visible progress, which can make a normally progressing job appear to be stalled.

Testing:

Verified exact-current results against the corresponding MILC-side implementation.
Verified that preserved deflation spaces reuse cached zero-mass eigenvalues and shift them correctly for subsequent masses.

…ases

…appy

…als calls

maddyscientist

This looks good to me. I'll leave @weinbe2 to comment on the MILC interface specifics. I've left a couple of comments.

maddyscientist · 2026-05-27T17:07:49Z

Thanks for the tweaks @leonhostetler. I've left a couple of non-essential comments that might be nice to update if you make another push.

@weinbe2 any thoughts before we merge?

weinbe2 · 2026-06-02T01:21:45Z

Thanks for the PR, @leonhostetler , this largely looks good. I have a few questions/thoughts:

Resetting the eigenvalue mass shift to the sentinel value of -1 seems a bit fragile still---qudaCleanupDeflationSpace() isn't triggered with a call to invalidateGaugeQuda(), or a fresh gauge field is loaded, for example. Maybe this is semi-intentional for heavy quarks where you want to plug in the heavy-quark-modified gauge fields, even if the eigenvectors technically change a bit, but maybe in that case we'd want a different routine for uploading those that doesn't invalidate the eigenvalue mass shift. I guess these considerations apply to the eigenvectors as well, so I should have considered this in an earlier PR.
Did you try this with a PRECISION=1 MILC build? I see you put in the proper plumbing to set the host precision in qudaExactCurrent, this is just making sure.

No other feedback, if you've verified that this works with the current workflow in MILC I trust you. It could be useful if you documented this on the QUDA Wiki in case other people wanted to use this, but the PR doesn't need to be blocked on that.

…from a single precision MILC run

…sion

…igenvector D2H Replace the per-eigenvector host post-processing in qudaExactCurrent with device-side accumulation. The old loop copied each staggered contraction (one complex/site) to the host and summed scaled imaginary parts on the CPU -- ~4*n_evec*2 tiny latency-bound D2H transfers, plus hand reinterpret_cast of a raw void* buffer. Instead, wrap the two halves of the contraction output (even/odd parity) as single-parity nColor=1, nSpin=1 reference fields and sum the scaled complex contractions on the device with blas::axpy. The imaginary part is extracted once at the end into MILC's interleaved jlow arrays, reducing the device->host traffic to 8 (or 16) larger reads.

leonhostetler · 2026-06-06T21:53:02Z

Thanks for reviewing this, @weinbe2 !

Resetting the eigenvalue mass shift to the sentinel value of -1 seems a bit fragile still

I agree. On the one hand, in practice, I don't know that we have any workflows where we load a completely different gauge config during a run. In principle though, we should probably guard against that possibility. As for the heavy quarks, yes, I figured that since the eigenvectors from the non-Naik'ed operator were already being used as an approximation to the real eigenvectors, then why not the eigenvalues as well? For deflated CG, I figured/assumed this would be fine in that they should be a good approximation and the CG should correct any errors. For other uses of the eigenvalues (as in qudaExactCurrent where the approximation is not corrected by CG), now that I think about it a little more, perhaps this is not a safe assumption at all. How do you think this should be handled? Maybe play it safe and force the eigenvalues to reset upon any gauge field change unless some explicit opt-in is set?

Did you try this with a PRECISION=1 MILC build?

I had neglected to test this after recent changes because an earlier investigation of single precision for our exact current calculations showed that it was not feasible for us. It was simply not precise enough. Nevertheless, I should've made sure that it works in principle. After fixing a number of bugs (b35fe11, a132898, b2b5463) it works now.

I have also taken the opportunity to make a few more updates to qudaExactCurrent. The commit 3b85f5c changes the function signature by changing the masses, jlow_mu, and jlow_mu2 from type double* to type void* since MILC uses type Real which is either float or double depending on MILC precision. It had worked, albeit with compiler warnings on the MILC side, but this makes it more correct, I think.

The commit 8357c77 does a refactoring of qudaExactCurrent to significantly reduce the number of small D2H transfers by replacing the per-eigenvector host accumulation with GPU-side accumulation and a small number of D2H transfers after the loop over eigenvectors.

weinbe2

This all looks good now, thank you for checking the precision = 1 builds---sounds like it indeed help make things more robust!

This is good to merge from my end.

leonhostetler added 16 commits August 6, 2025 14:52

Add qudaExactCurrent to MILC interface

ddda7ed

Generalized qudaExactCurrent to handle the multi-mass (differences) c…

f0ad43c

…ases

Minor cleanup

e9b45a3

Fix links not properly reloaded

8781ef7

Fix number of eigenvectors used in qudaExactCurrent

822266f

Merge branch 'develop' into feature/milc-exact-current

0db9621

Replace pool_device_malloc with qudaAllocatePinned to make Frontier h…

537a4bb

…appy

Revert allocator

7c496a4

Make eigensolver more verbose

9e687d0

Use full parity vectors with contractQuda

dc2dfe8

Merge branch 'develop' into feature/milc-exact-current

2cbbb62

Preserved evec spaces and eval masses are arrays now

70ee7c2

Cache and shift zero-mass eigenvalues to reduce unnecessary computeEv…

c026df8

…als calls

Eliminate redundant computeEvals call

383c9d6

Merge branch 'develop' into feature/milc-exact-current

65219ca

Clang format

59fa9e1

leonhostetler requested a review from a team as a code owner May 26, 2026 18:03

Fix CI error

ad5712d

maddyscientist approved these changes May 26, 2026

View reviewed changes

Comment thread lib/eig_trlm.cpp Outdated

Comment thread lib/milc_interface.cpp Outdated

leonhostetler added 3 commits May 27, 2026 11:16

Print 10 lanczosStep notifications per restart

f01a07d

Make evec/eval computations MRHS in qudaLoadDeflationSpace

4b9a4b6

Clang format

f1c1940

maddyscientist reviewed May 27, 2026

View reviewed changes

Comment thread lib/milc_interface.cpp Outdated

maddyscientist reviewed May 27, 2026

View reviewed changes

Comment thread lib/milc_interface.cpp Outdated

leonhostetler added 3 commits May 27, 2026 13:03

Replace logQuda with printfQuda where logQuda is redundant

b88eaf5

Replace copy+ax with axy

2581865

Clang format

1d4bc11

weinbe2 reviewed Jun 2, 2026

View reviewed changes

Comment thread lib/milc_interface.cpp Outdated

leonhostetler added 9 commits June 6, 2026 11:42

Add helper forceOuterDouble to reduce redundant code

5df64b0

Match the deflatable inverter's outer-precision promotion

b35fe11

Fix bug with double precision eigensolve (eigensolver_prec 2) called …

a132898

…from a single precision MILC run

Fix issues with qudaExactCurrent when running MILC in single precision

b2b5463

qudaExactCurrent: pass host masses/currents as void* + external_preci…

3b85f5c

…sion

Replace asserts with if + errorQuda

e8275c3

Merge branch 'develop' into feature/milc-exact-current

dfaf439

Clang format

26814ff

weinbe2 approved these changes Jun 12, 2026

View reviewed changes

weinbe2 merged commit 914b462 into develop Jun 12, 2026
6 of 7 checks passed

weinbe2 deleted the feature/milc-exact-current branch June 12, 2026 17:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MILC interface: Exact current#1633

MILC interface: Exact current#1633
weinbe2 merged 32 commits into
developfrom
feature/milc-exact-current

leonhostetler commented May 26, 2026

Uh oh!

maddyscientist left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

maddyscientist commented May 27, 2026

Uh oh!

Uh oh!

weinbe2 commented Jun 2, 2026

Uh oh!

leonhostetler commented Jun 6, 2026 •

edited

Loading

Uh oh!

weinbe2 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

leonhostetler commented May 26, 2026

Uh oh!

maddyscientist left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

maddyscientist commented May 27, 2026

Uh oh!

Uh oh!

weinbe2 commented Jun 2, 2026

Uh oh!

leonhostetler commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

weinbe2 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

leonhostetler commented Jun 6, 2026 •

edited

Loading