Skip to content

fix(dynacell): green up dynacell-configs CI (run_for_group import + a549 any-GPU test)#446

Merged
edyoshikun merged 5 commits into
ci/staging-green-upfrom
ci/dynacell-eval-green-followup
Jun 2, 2026
Merged

fix(dynacell): green up dynacell-configs CI (run_for_group import + a549 any-GPU test)#446
edyoshikun merged 5 commits into
ci/staging-green-upfrom
ci/dynacell-eval-green-followup

Conversation

@alxndrkalinin

Copy link
Copy Markdown
Collaborator

What

Closes the last two Test dynacell benchmark configs failures that #435 leaves open. Stacked on ci/staging-green-up — its effective diff against the base is just two files (the predict→any-GPU config change is already on the base via the staging merge, so it reconciles to a no-op here).

Fixes

  1. run_for_group cross-condition probe entry point (cross_condition_probe.py)
    pipeline.py imports run_for_group from cross_condition_probe, but the function was never committed → dynacell.evaluation.pipeline is unimportable in a clean checkout (ImportError: cannot import name 'run_for_group'), so test_evaluation_grouped + both test_evaluation_pipeline_parallel* suites (17 tests) fail at collection. Adds the function (+ _write_rows) matching the call site: runs the existing pairwise probe per infected condition against mock and writes one CSV per eval dir.

  2. a549 predict-leaf constraint test → is None (test_benchmark_config_composition.py)
    The staging merge unpinned predict leaves from H200 (now hardware_predict_any_gpu, constraint: null), but test_a549_predict_leaf_composes still asserted == "h200" → 24 failures. The test now asserts constraint is None, matching the any-GPU predict profile.

Verification

Full test-dynacell-configs job set green against the merged tree (incl. #429 eval refactor): test_benchmark_config_composition, test_submit_benchmark_job, test_runtime, test_evaluation_pipeline_parallel, test_evaluation_pipeline_parallel_cpu, test_evaluation_grouped0 failed.

🤖 Generated with Claude Code

alxndrkalinin and others added 5 commits June 2, 2026 14:54
pipeline.py imports run_for_group from cross_condition_probe to write a
per-condition mock-vs-infected probe CSV into each condition's eval dir,
but the function was never committed. As a result
dynacell.evaluation.pipeline was unimportable in a clean checkout
(ImportError: cannot import name 'run_for_group'), so the grouped and
parallel eval tests failed at collection in CI.

Add run_for_group (and the _write_rows helper) matching the pipeline
call site: it runs the existing pairwise probe per infected condition
against mock and writes one CSV per eval dir, colocated with the dir the
reporting layer already resolves per (model, pool, organelle, condition).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Predict jobs were pinned to H200 by every leaf composing
hardware_h200_single.yml. Predict at FP32 fits in ~7 GB and is
compute-bound (measured 6.6 GB / 100% SM / 15-20% bandwidth on celldiff
a549-trained membrane mock, H200), so Hopper HBM/bandwidth is unused —
A40 / A6000 / L40S / L4 all run it and drain the queue faster.

Add hardware_predict_any_gpu.yml (same resource shape as
hardware_h200_single, constraint=null) and swap all 233 tracked predict
leaves across celldiff, fcmae_vscyto3d_pretrained, fcmae_vscyto3d_scratch,
fnet3d_paper, pix2pix3d_unetvit, and unetvit3d to use it. Train leaves
stay on hardware_h200_single / hardware_4gpu (memory + bandwidth profile
differs). README updated.

test_a549_predict_leaf_composes now asserts sbatch.constraint is None
(any GPU) instead of "h200". This is the config-side fix for the 24 a549
composition failures #435 left open by re-pinning predicts to H200.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…op dead param)

Post-review cleanups, no behavior change:
- Iterate _DEFAULT_PAIRS instead of hardcoding ("denv","zikv") + building
  ("mock", cond) by hand — keeps the mock-vs-infected pair set in one place,
  so a future infected condition is picked up automatically.
- Drop the unused filename parameter (no caller passes it) and write to
  GROUP_PROBE_FILENAME directly.
- Drop the needless list() around the _FIELDNAMES tuple in _write_rows
  (csv.DictWriter accepts any sequence).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
test_a549_predict_leaf_composes used .get("constraint") is None, which
also passes if the hardware profile is dropped from the leaf's base chain
(missing key -> None). Subscript the key so a dropped any-GPU profile
fails loudly instead of silently — the any-GPU profile composes an
explicit constraint=null, so the key is always present when applied.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alxndrkalinin alxndrkalinin requested a review from Copilot June 2, 2026 22:25
@edyoshikun edyoshikun merged commit 146ac39 into ci/staging-green-up Jun 2, 2026
1 check passed
@edyoshikun edyoshikun deleted the ci/dynacell-eval-green-followup branch June 2, 2026 22:27

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses remaining CI failures in the dynacell benchmark-config test suite by restoring a missing evaluation helper entry point and aligning a benchmark-config composition test with the new “any GPU” predict hardware profile.

Changes:

  • Add run_for_group() (plus _write_rows() + shared CSV fieldnames) to dynacell.evaluation.cross_condition_probe and refactor run() to reuse the CSV writer helper.
  • Update test_a549_predict_leaf_composes to assert sbatch.constraint is None for any-GPU predict leaves, and to index the key directly so a missing hardware overlay fails loudly.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
applications/dynacell/src/dynacell/evaluation/cross_condition_probe.py Adds the missing grouped probe entry point used by the evaluation pipeline; centralizes CSV writing.
applications/dynacell/tests/test_benchmark_config_composition.py Updates A549 predict-leaf constraint assertion to match the any-GPU predict profile behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

alxndrkalinin added a commit that referenced this pull request Jun 3, 2026
* ci: drop Python 3.11 from matrix and trigger on modular-viscy-staging

All workspace packages require python>=3.12 but the CI matrix included
3.11, causing uv sync to fail with 'incompatible with the project's
Python requirement: >=3.12' and fail-fast cancelling all other matrix
jobs.

Also extend push/pull_request triggers to modular-viscy-staging so PRs
targeting the 0.5 staging branch run lint/test.

* fix(deps): declare viscy-transforms test dep and hydra-core for dynacell

- packages/viscy-data tests import viscy_transforms (test_combined.py,
  test_hcs.py, test_hcs_weighted_crop.py) but it was not declared in the
  test group, causing ModuleNotFoundError in Test Data Extras CI.
- applications/dynacell/src/dynacell/evaluation/pipeline.py:9 imports
  hydra unconditionally. hydra-core was only present in the 'eval' and
  'report' optional extras, so 'uv sync --group test' (used by Test
  dynacell benchmark configs CI) failed at collection with
  ModuleNotFoundError: No module named 'hydra'. Promote hydra-core to a
  base dependency since it's needed at import time.

* fix(dynaclr): rename test_2d_mip_augmentation.py to demo_; scope pytest to tests/

The file is a Jupyter-style demo script (# %% cells) with a hardcoded
absolute home path /home/eduardo.hirata/... It is meant to be invoked
via 'uv run python ...', not collected by pytest, but its test_ prefix
caused 'Test (dynaclr)' CI to fail with FileNotFoundError on the missing
parquet at collection time.

- Rename to demo_2d_mip_augmentation.py so pytest does not collect it.
- Add testpaths=['tests'] under [tool.pytest.ini_options] in dynaclr's
  pyproject.toml so future demo scripts under scripts/ are not picked up
  when CI invokes pytest from applications/dynaclr/.

* style: satisfy pre-commit (ruff format, EOF, D100/D102, E741, NPY002)

Auto-fixes from prek/pre-commit hooks:
- ruff format: distance.py, lca.py, normalize.py, precompute.py,
  conftest.py (5 files reformatted, no semantic change)
- end-of-file-fixer: append trailing newline to .envrc

Manual fixes:
- E741: rename ambiguous 'O' to 'Ov' (overlap) in celldiff_wrapper.py
  sliding-window patcher.
- D100: add module docstrings to plot_dim_reduct.py, tau_sampling.py,
  channel_dropout.py.
- D102: add forward() docstring in ChannelDropout.
- NPY002: replace np.random.randn() with np.random.default_rng() in
  test_mp_utils.py.

The renamed demo_2d_mip_augmentation.py is in this commit because
ruff format touched its imports as part of the same auto-pass.

* fix(dynaclr): unpack tuple from ExperimentRegistry.from_cell_index in e2e test

from_cell_index returns (ExperimentRegistry, DataFrame); the fixture assigned
the tuple to a bare `registry` and passed it to MultiExperimentIndex, which
raised `AttributeError: 'tuple' object has no attribute 'experiments'` at
setup of every TestFlatParquetRegistry / TestChannelModes / TestSamplerGroupings
test. Unpack and discard the DataFrame (the fixture builds its own).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell): point predict-leaf test at split celldiff inference configs

celldiff's predict__ipsc_confocal.yml was split into per-method variants
(__denoise, __iterative, __sliding_window), but test_predict_leaf_composes
still expected the single file for every model in PREDICT_LEAVES, raising
FileNotFoundError for all four celldiff organelles. Add PREDICT_LEAF_FILES
parametrizing celldiff's three variants plus unetvit3d's single file, and
switch the predict-composition test to it. PREDICT_LEAVES is retained for
the eval symlink test, whose eval__ipsc_confocal.yaml is unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* style: ruff format PREDICT_LEAF_FILES comprehension

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(viscy-data): pin iohub codec pipeline to zarr-python, not zarrs

iohub auto-enables the Rust `zarrs` codec pipeline whenever `zarrs` is
importable (it ships as a transitive dep of iohub 0.3.x). `zarrs` 0.2.3
builds a rank-6 read offset for rank-5 sharded zarr-v3 arrays, raising
`RuntimeError: incompatible offset [0,0,0,0,0,0] for region with start
[0,0,0,0,0]` on every sharded v3 read, and deadlocks spawned DDP ranks.

iohub re-applies its codec choice on every store open, so a one-time
`zarr.config.set(...)` is clobbered by the next `open_ome_zarr`. Override
iohub's ZarrConfig field default instead, applied on import of viscy_data
(the lowest zarr-reading package; everything else depends on it) so the
native `BatchedCodecPipeline` is in place before any store is opened.

Fixes the zarr_v3 failures in test-data / test-data-extras and the
test_combined_ddp hang.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell): lazy-import torch_fidelity in feature_metrics

torch_fidelity is in dynacell's [eval] extra, but feature_metrics.py
imported its metric helpers at module top. The minimal CI test job
(test group only, no eval extras) imports the eval pipeline via
test_evaluation_grouped.py, so collection failed with
`ModuleNotFoundError: No module named 'torch_fidelity'` at pipeline.py:18
-> feature_metrics.py:18.

Move the four torch_fidelity imports into the _fid/_kid/_bootstrap_prc/
_mind functions that use them. The cache-only grouped parity test
short-circuits before any metric runs, so the eval pipeline now imports
without torch_fidelity, matching the CI job's documented minimal-env intent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynaclr): repoint class-path-resolution tests at demo configs

configs/training/fit.yml and multi_experiment_fit.yml were removed in the
config reorg (ecf7c08 "refactor training configs into composable
_base/arch/data layers"), but two tests still asserted they exist and that
their class_paths resolve, failing with "Config file not found".

Point them at the current full demo configs: demo/demo_2d_fit.yml and
demo/demo_bag_of_channels_v3_fit.yml. The predict.yml parametrization is
unchanged (it still exists). All resolved class_paths import cleanly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell): align 4gpu-constraint test with hardware_4gpu.yml (h100|h200)

test_4gpu_train_leaves_inherit_a100_exclude asserted constraint
'h100|h200|a40|a6000|l40s', but hardware_4gpu.yml sets 'h100|h200' and its
prose explains why: 4-GPU FCMAE/UNeXt2 train at large patches where a single
DDP rank needs 30-50 GB, so the 48 GB A40/A6000/L40S cards have no headroom.
The config is authoritative; the test constant was inconsistent (both landed
together in #404). Update the expected constraint and the test's rationale
comment to match. The file's override-hint (opt in to smaller cards) is
unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(viscy-data): make mmap-sharing test child picklable for spawn

test_mmap_preload_multi_process_sharing defined its child target as a nested
closure, which fails under the spawn start method (Windows/macOS default)
with `AttributeError: Can't get local object`. fork (Linux) inherits the
function so it passed there. Move the child to module level as
_mmap_sharing_child and pass the data path explicitly instead of closing
over the fixture. Verified passing under a forced spawn context.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell): reconcile benchmark-config tests with split/redesigned leaves

The config layout was refactored but test_benchmark_config_composition.py
still asserted the old layout, failing once collection-blocking imports were
fixed. Reconcile the tests against the current configs:

- test_a549_predict_leaf_composes: predict__a549_mantis.yml was split per
  treatment condition (_mock/_denv/_zikv). Generate expectations over
  gene-info x model x condition; assert the condition-specific data_path,
  dataset_ref, and experiment_id each composed leaf actually produces.
- test_migrated_target_predict_resolves_to_test_store: celldiff's
  predict__ipsc_confocal.yml was split per inference method; point celldiff
  at the sliding_window representative (all three resolve the same store).
- test_joint_train_leaf_composes / _smoke_: the joint train leaf was
  redesigned to single-GPU H200 (strategy auto, devices 1, bare-string
  channels) and the a549 child path moved to mantis_v1/train/SEC61B_all.zarr.
  Update assertions to match; topology still satisfies the SLURM invariant.

Configs unchanged. Full file: 165 passed, 52 skipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(viscy-data): use fork context + Windows skip for mmap-sharing test

Making the child target module-level was necessary but not sufficient: under
the spawn start method (Windows default) the child re-imports the test
module, which pytest loaded via --import-mode=importlib, failing with
ModuleNotFoundError: No module named 'packages'. Use an explicit fork
context (child inherits the loaded module, no re-import) and skip on Windows
where fork is unavailable -- the same pattern test_combined_ddp.py already
uses for this exact issue.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(dynaclr): import NTXentLoss from viscy_models

Engine's isinstance check targets viscy_models' NTXentLoss
subclass, so importing pml's base class fell through to the
triplet branch and passed a 2-D projection as the 1-D label
arg ("labels must be a 1D tensor").

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(dynaclr): use bag-of-channels for heterogeneous experiments

All-channels mode is undefined when experiments carry different
channel sets: norm_meta collapses to one channel per cell after
dedup, and _scatter_channels zips the marker union positionally
against a single experiment's patch. Bag-of-channels
(channels_per_sample=1) is the supported path — one channel per
sample keyed channel_0.

- _C 2 -> 1, channel_dropout_channels [1] -> [0]
- drop bogus collection_path kwarg from the parquet case

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(dynaclr): repoint multi-experiment parquet test at cell_index_path entry point

* deps: regenerate uv.lock after green-up merge

* fix(dynacell): green up dynacell-configs CI (run_for_group import + a549 any-GPU test) (#446)

* feat(dynacell/eval): add run_for_group cross-condition probe entry point

pipeline.py imports run_for_group from cross_condition_probe to write a
per-condition mock-vs-infected probe CSV into each condition's eval dir,
but the function was never committed. As a result
dynacell.evaluation.pipeline was unimportable in a clean checkout
(ImportError: cannot import name 'run_for_group'), so the grouped and
parallel eval tests failed at collection in CI.

Add run_for_group (and the _write_rows helper) matching the pipeline
call site: it runs the existing pairwise probe per infected condition
against mock and writes one CSV per eval dir, colocated with the dir the
reporting layer already resolves per (model, pool, organelle, condition).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(dynacell): unpin predict from H200 via any-GPU launcher profile

Predict jobs were pinned to H200 by every leaf composing
hardware_h200_single.yml. Predict at FP32 fits in ~7 GB and is
compute-bound (measured 6.6 GB / 100% SM / 15-20% bandwidth on celldiff
a549-trained membrane mock, H200), so Hopper HBM/bandwidth is unused —
A40 / A6000 / L40S / L4 all run it and drain the queue faster.

Add hardware_predict_any_gpu.yml (same resource shape as
hardware_h200_single, constraint=null) and swap all 233 tracked predict
leaves across celldiff, fcmae_vscyto3d_pretrained, fcmae_vscyto3d_scratch,
fnet3d_paper, pix2pix3d_unetvit, and unetvit3d to use it. Train leaves
stay on hardware_h200_single / hardware_4gpu (memory + bandwidth profile
differs). README updated.

test_a549_predict_leaf_composes now asserts sbatch.constraint is None
(any GPU) instead of "h200". This is the config-side fix for the 24 a549
composition failures #435 left open by re-pinning predicts to H200.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(dynacell/eval): tidy run_for_group (reuse _DEFAULT_PAIRS, drop dead param)

Post-review cleanups, no behavior change:
- Iterate _DEFAULT_PAIRS instead of hardcoding ("denv","zikv") + building
  ("mock", cond) by hand — keeps the mock-vs-infected pair set in one place,
  so a future infected condition is picked up automatically.
- Drop the unused filename parameter (no caller passes it) and write to
  GROUP_PROBE_FILENAME directly.
- Drop the needless list() around the _FIELDNAMES tuple in _write_rows
  (csv.DictWriter accepts any sequence).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(dynacell): assert predict constraint key present-and-null

test_a549_predict_leaf_composes used .get("constraint") is None, which
also passes if the hardware profile is dropped from the leaf's base chain
(missing key -> None). Subscript the key so a dropped any-GPU profile
fails loudly instead of silently — the any-GPU profile composes an
explicit constraint=null, so the key is always present when applied.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Alexandr Kalinin <alxndrkalinin@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell/eval): unblock dynacell-configs CI (lazy cubic import) + cross-condition probe hardening (#447)

* perf(dynacell/eval): memoize embedding loads across cross-condition pairs

_probe_pair re-read each NPZ from disk on every call, so the shared mock
reference side was loaded once per pair: run_for_group read the 8 mock
files (4 features x 2 sources) 16 times, and run re-read each condition
once per pair. Thread an optional per-call dict cache keyed on the NPZ
path through _probe_pair -> _load_embeddings; run / run_for_group create
one and pass it so each file is read once per group.

Measured on a synthetic 3-condition group: run_for_group drops from 32 to
24 np.load reads (~26 MB of redundant mock NPZ I/O per group avoided). The
cache is local to each call and released on return, so memory stays bounded
to one group's embeddings rather than accumulating across groups (which a
module-level lru_cache would do).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell/eval): raise on duplicate condition in run_for_group

run_for_group built its condition->dir map with a last-wins assignment,
so two dirs mapping to the same condition (e.g. two *_mock dirs) silently
overwrote each other and produced a probe against an arbitrary dir with no
signal. run() already raises ValueError on duplicates; mirror that here so
an ambiguous group surfaces as an error (the grouped pipeline catches it
and safely skips the probe). Unrecognized-token dirs are still skipped, as
documented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell/eval): lazily import cubic in instance_metrics

instance_metrics imported cubic.metrics.average_precision at module top
level, and pipeline_cache imports DEFAULT_IOU_THRESHOLDS from it — so the
whole pipeline import chain pulled the GPU-only cubic stack. The minimal
`test-dynacell-configs` CI job (uv sync --group test, no eval extra) then
failed to import dynacell.evaluation.pipeline with
`ModuleNotFoundError: No module named 'cubic'`, taking down every
grouped / parallel / parallel_cpu eval test at collection.

Move the cubic import into instance_average_precision (its only use), so
importing the module for the threshold constant no longer needs cubic.
The AP computation still hard-requires cubic and fails loudly there.
Mirrors the lazy torch_fidelity imports in feature_metrics. Verified:
dynacell.evaluation.pipeline now imports with cubic unavailable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Alexandr Kalinin <alxndrkalinin@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ci): install dynacell eval extra instead of hiding cubic imports

The test-dynacell-configs job installed only base deps + the test group,
so the eval-pipeline tests (which import dynacell.evaluation.pipeline)
failed with `ModuleNotFoundError: No module named 'cubic'` once the import
chain reached instance_metrics / segmentation. cubic is a *hard* dependency
of the eval pipeline (evaluation/CLAUDE.md: do not hide its imports), and
it is CPU-capable — it falls back to numpy/scikit-image without CUDA — so
the right fix is to install it, not to lazily defer its import.

- CI: `uv sync --frozen --extra eval --group test` (CPU; eval_gpu cupy/cucim
  stays out — those are the only CUDA-only deps).
- Revert the lazy `from cubic.metrics import average_precision` in
  instance_metrics back to a module-level import (restores the convention;
  the deferral added earlier is no longer needed now that cubic is present).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(dynacell/eval): restore module-level torch_fidelity imports

torch-fidelity is a hard dependency of the eval extra (and the
test-dynacell-configs CI job now installs that extra), so there is no
reason to defer its imports into _fid/_kid/_mind/_bootstrap_prc. Move
fid/kid/mind/prc helper imports back to the module top, matching the
convention that eval deps are imported eagerly and fail loud if absent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell/eval): don't hard-require cupy/cucim; use cubic CPU fallback

_require_cubic (in metrics.py and io.py) raised
"<pkg> is required for GPU-backed metrics … uv sync --extra eval_gpu"
whenever cupy/cucim were absent — even on the CPU path. That defeated
cubic's own numpy/scikit-image fallback: compute_pixel_metrics and
_cp_raw_regionprops already gate the GPU upload on
`torch.cuda.is_available()` and use asnumpy/numpy otherwise, where cubic
dispatches to CPU. The eager gate fired before that code ran, so a
GPU-less runner with the eval extra (but no eval_gpu) — e.g. GitHub CI —
failed instead of falling back.

Drop the cupy/cucim requirement from both _require_cubic; keep the cubic
presence check. On an actual GPU run without cupy, `ascupy` still raises a
clear "GPU requested but not available". Verified: the eval-pipeline +
grouped tests pass with the GPU hidden (CUDA_VISIBLE_DEVICES="") across
the serial and spawn-process paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(dynacell/eval): resolve NPZ path for the embedding cache key

_load_embeddings keyed its memoization cache on the unresolved
eval_dir/embeddings/<f>.npz path while the docstring promised the
*resolved* path — a doc/code mismatch that could also defeat caching
when the same file is reached via different relative paths or symlinks.
Resolve the path (strict=False, so a missing file still surfaces as
np.load's FileNotFoundError) before the cache lookup and load.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* buump iohub 0.3.6

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Sricharan Reddy Varra <sricharan.varra@biohub.org>
Co-authored-by: Alexandr Kalinin <1107762+alxndrkalinin@users.noreply.github.com>
Co-authored-by: Alexandr Kalinin <alxndrkalinin@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants