feat: add gradient fallback helpers (tesseract_core.runtime.experimental.vjp_from_jacobian, ...) for deriving AD endpoints from each other#511
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #511 +/- ##
==========================================
+ Coverage 67.74% 77.06% +9.32%
==========================================
Files 31 32 +1
Lines 4291 4348 +57
Branches 705 718 +13
==========================================
+ Hits 2907 3351 +444
+ Misses 1146 710 -436
- Partials 238 287 +49
I'm wondering if we could address this in a less magical way by providing functions like
@dionhaefner good idea! I think the only advantage of this approach is that we can add such endpoints at serve time and do not need to rebuild the Tesseract. But I also see arguments against this: in a way, it changes the predefined contract of a Tesseract, which is a bit of an antipattern. I am leaning towards the approach you just proposed.
Yes, I think the only time this is worse is when someone wants to use a pre-built Tesseract image in a workflow that requires VJPs but the Tesseract only has a Jacobian defined. OTOH we can still enable things globally if the experimental feature proves really useful.
Benchmark Results
Benchmarks use a no-op Tesseract to measure pure framework overhead.
🚀 0 faster, ✅ No significant performance changes detected.
@andrinr Please adjust the PR description and title (so we get a meaningful entry in the changelog).
Co-authored-by: Dion Häfner <dion.haefner@simulation.science>
dionhaefner left a comment
Looks great, thanks!
Relevant issue or PR
Resolves #465 when merged
Description of changes
Adds four experimental helpers in `tesseract_core.runtime.experimental` for deriving missing AD endpoints from ones already implemented:

- `jvp_from_jacobian` / `vjp_from_jacobian`: derive JVP or VJP from an existing `jacobian` endpoint by contracting the Jacobian matrix with the tangent / cotangent vector
- `jacobian_from_jvp`: materialise the full Jacobian by probing with N one-hot tangents (N = input elements); output shapes are inferred from the first probe, so no extra `apply` call is needed
- `jacobian_from_vjp`: materialise the full Jacobian by probing with M one-hot cotangents (M = output elements); takes an `eval_fn` argument (either `apply` or the cheaper `abstract_eval`) to determine output shapes upfront

All helpers live in `tesseract_core/runtime/ad_endpoint_derivation.py` and are re-exported through `tesseract_core.runtime.experimental`.

Testing done

- `tests/runtime_tests/test_autodiff_fallbacks.py`: unit tests parametrised over two nonlinear functions (wide R⁴→R³ and tall R²→R⁴); `jacobian_from_vjp` additionally parametrised over `apply` vs `abstract_eval`; round-trip tests verify VJP→Jacobian→JVP and JVP→Jacobian→VJP chains
- `examples/univariate_adfallbacks`: e2e Rosenbrock variant using `jvp_from_jacobian` / `vjp_from_jacobian`; covers `test_apply`, `test_jacobian_wrt_x`, `test_jvp`, `test_vjp`, and `check_gradients`; registered in `tests/endtoend_tests/test_examples.py` with `check_gradients=True`
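
For readers unfamiliar with the underlying math, the contraction and one-hot probing described under "Description of changes" can be sketched in plain Python. This is only an illustration of the idea, not the code added in this PR: every function name and signature below (`jvp_via_jacobian`, `vjp_via_jacobian`, `jacobian_via_jvp`, `jacobian_via_vjp`, the toy map `f`) is hypothetical and works on flat lists of floats rather than Tesseract input/output schemas.

```python
# Illustrative sketch only: plain-Python stand-ins, NOT the tesseract_core API.
# All names and signatures here are hypothetical.

def f(x):
    """Toy nonlinear map R^2 -> R^3 (plays the role of `apply`)."""
    a, b = x
    return [a * b, a * a, a + 3.0 * b]

def jacobian(x):
    """Hand-written 3x2 Jacobian of f at x (rows: outputs, columns: inputs)."""
    a, b = x
    return [[b, a], [2.0 * a, 0.0], [1.0, 3.0]]

def jvp_via_jacobian(x, tangent):
    """JVP as the matrix-vector product J @ tangent."""
    return [sum(j * t for j, t in zip(row, tangent)) for row in jacobian(x)]

def vjp_via_jacobian(x, cotangent):
    """VJP as the vector-matrix product cotangent @ J."""
    J = jacobian(x)
    n_in = len(J[0])
    return [sum(c * row[i] for c, row in zip(cotangent, J)) for i in range(n_in)]

def one_hot(size, index):
    """Vector of zeros with a single 1.0 at position `index`."""
    return [1.0 if k == index else 0.0 for k in range(size)]

def jacobian_via_jvp(jvp, x):
    """N = len(x) probes: each one-hot tangent returns one column of J.

    The output size falls out of the first probe, so f is never called.
    """
    columns = [jvp(x, one_hot(len(x), i)) for i in range(len(x))]
    return [list(row) for row in zip(*columns)]  # transpose columns into rows

def jacobian_via_vjp(vjp, x, n_out):
    """M = n_out probes: each one-hot cotangent returns one row of J.

    n_out must be known before probing, hence the extra argument.
    """
    return [vjp(x, one_hot(n_out, k)) for k in range(n_out)]

x = [2.0, 5.0]
J = jacobian(x)
J_from_jvp = jacobian_via_jvp(jvp_via_jacobian, x)
# Here the output size comes from evaluating f; a shape-only evaluation would do as well.
J_from_vjp = jacobian_via_vjp(vjp_via_jacobian, x, n_out=len(f(x)))
```

The sketch also shows why the two probing directions differ in cost: the JVP route needs one probe per input element and the VJP route one probe per output element, so which one is cheaper depends on whether the function is wide or tall.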