Skip to content

MILC interface: Exact current#1633

Merged
weinbe2 merged 32 commits into
developfrom
feature/milc-exact-current
Jun 12, 2026
Merged

MILC interface: Exact current#1633
weinbe2 merged 32 commits into
developfrom
feature/milc-exact-current

Conversation

@leonhostetler

Copy link
Copy Markdown
Contributor

This is the second and final PR in the series to enable exact current calculations for HVP in the MILC/QUDA interface. The previous PR was #1590 .

Changes:

  1. Adds a new MILC interface function, qudaExactCurrent, in lib/milc_interface.cpp. This computes the exact low-mode contribution to the current densities needed for HVP current calculations.

  2. Implements eigenvalue shifting in the MILC interface for preserved deflation spaces. Previously, when using a preserved deflation space across multiple solves, the eigenvalues were recomputed each time the mass changed. With this update, computeEvals is called only once, during the first solve. The resulting zero-mass eigenvalues are cached and shifted as needed for subsequent masses.

    This avoids many unnecessary mat-vecs and greatly reduces output volume: eigenvalues are now printed only once instead of once per solve, which previously could produce O(10^6) lines of eigenvalue output in a typical production run.

  3. Makes lib/eig_block_trlm.cpp and lib/eig_trlm.cpp slightly more verbose by printing progress notifications every 10th or 100th Lanczos step. On production-size lattices, eigensolves can otherwise run for more than an hour without any visible progress, which can make a normally progressing job appear to be stalled.

Testing:

  • Verified exact-current results against the corresponding MILC-side implementation.
  • Verified that preserved deflation spaces reuse cached zero-mass eigenvalues and shift them correctly for subsequent masses.

@leonhostetler leonhostetler requested a review from a team as a code owner May 26, 2026 18:03

@maddyscientist maddyscientist left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I'll leave @weinbe2 to comment on the MILC interface specifics. I've left a couple of comments.

Comment thread lib/eig_trlm.cpp Outdated
Comment thread lib/milc_interface.cpp Outdated
Comment thread lib/milc_interface.cpp Outdated
Comment thread lib/milc_interface.cpp Outdated
@maddyscientist

Copy link
Copy Markdown
Member

Thanks for the tweaks @leonhostetler. I've left a couple of non-essential comments that might be nice to update if you make another push.

@weinbe2 any thoughts before we merge?

Comment thread lib/milc_interface.cpp Outdated
@weinbe2

weinbe2 commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Thanks for the PR, @leonhostetler , this largely looks good. I have a few questions/thoughts:

  • Resetting the eigenvalue mass shift to the sentinel value of -1 seems a bit fragile still---qudaCleanupDeflationSpace() isn't triggered with a call to invalidateGaugeQuda(), or a fresh gauge field is loaded, for example. Maybe this is semi-intentional for heavy quarks where you want to plug in the heavy-quark-modified gauge fields, even if the eigenvectors technically change a bit, but maybe in that case we'd want a different routine for uploading those that doesn't invalidate the eigenvalue mass shift. I guess these considerations apply to the eigenvectors as well, so I should have considered this in an earlier PR.
  • Did you try this with a PRECISION=1 MILC build? I see you put in the proper plumbing to set the host precision in qudaExactCurrent, this is just making sure.

No other feedback, if you've verified that this works with the current workflow in MILC I trust you. It could be useful if you documented this on the QUDA Wiki in case other people wanted to use this, but the PR doesn't need to be blocked on that.

…igenvector D2H

Replace the per-eigenvector host post-processing in qudaExactCurrent with
device-side accumulation. The old loop copied each staggered contraction
(one complex/site) to the host and summed scaled imaginary parts on the
CPU -- ~4*n_evec*2 tiny latency-bound D2H transfers, plus hand
reinterpret_cast of a raw void* buffer.

Instead, wrap the two halves of the contraction output (even/odd parity)
as single-parity nColor=1, nSpin=1 reference fields and sum the scaled
complex contractions on the device with blas::axpy. The imaginary
part is extracted once at the end into MILC's interleaved jlow arrays,
reducing the device->host traffic to 8 (or 16) larger reads.
@leonhostetler

leonhostetler commented Jun 6, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for reviewing this, @weinbe2 !

Resetting the eigenvalue mass shift to the sentinel value of -1 seems a bit fragile still

I agree. On the one hand, in practice, I don't know that we have any workflows where we load a completely different gauge config during a run. In principle though, we should probably guard against that possibility. As for the heavy quarks, yes, I figured that since the eigenvectors from the non-Naik'ed operator were already being used as an approximation to the real eigenvectors, then why not the eigenvalues as well? For deflated CG, I figured/assumed this would be fine in that they should be a good approximation and the CG should correct any errors. For other uses of the eigenvalues (as in qudaExactCurrent where the approximation is not corrected by CG), now that I think about it a little more, perhaps this is not a safe assumption at all. How do you think this should be handled? Maybe play it safe and force the eigenvalues to reset upon any gauge field change unless some explicit opt-in is set?

Did you try this with a PRECISION=1 MILC build?

I had neglected to test this after recent changes because an earlier investigation of single precision for our exact current calculations showed that it was not feasible for us. It was simply not precise enough. Nevertheless, I should've made sure that it works in principle. After fixing a number of bugs (b35fe11, a132898, b2b5463) it works now.

I have also taken the opportunity to make a few more updates to qudaExactCurrent. The commit 3b85f5c changes the function signature by changing the masses, jlow_mu, and jlow_mu2 from type double* to type void* since MILC uses type Real which is either float or double depending on MILC precision. It had worked, albeit with compiler warnings on the MILC side, but this makes it more correct, I think.

The commit 8357c77 does a refactoring of qudaExactCurrent to significantly reduce the number of small D2H transfers by replacing the per-eigenvector host accumulation with GPU-side accumulation and a small number of D2H transfers after the loop over eigenvectors.

@weinbe2 weinbe2 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good now, thank you for checking the precision = 1 builds---sounds like it indeed help make things more robust!

This is good to merge from my end.

@weinbe2 weinbe2 merged commit 914b462 into develop Jun 12, 2026
6 of 7 checks passed
@weinbe2 weinbe2 deleted the feature/milc-exact-current branch June 12, 2026 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants