
Imports #1

Merged
AllenWLynch merged 13 commits into main from imports
Apr 28, 2026

Conversation

@AllenWLynch
Contributor

No description provided.

AllenWLynch and others added 13 commits April 13, 2026 17:58
- sbs: max-subtract log_locus/log_context before exp in _get_exp_offsets_k_c
  and restore the shift on the scalar normalizer in log-space; fall back to
  the previous normalizer if the slice is still degenerate
- theta_model: clip raw predictions to [-30, 30] at the log_locus_distribution
  write point so exp() stays within float32
- context_model: clip _get_log_context_distribution to [-30, 30]
- optim: finite-check normalizers after every get_exp_offsets_dict call and
  re-enable init_normalizers so epoch 1 doesn't start from zero
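The max-subtract-before-exp pattern in the first bullet is the standard log-sum-exp stabilization: shift log-space values by their maximum so `exp()` never overflows, then restore the shift on the scalar normalizer in log-space. A generic sketch (function and argument names are illustrative, not the project's actual API):

```python
import math

def stable_exp_offsets(log_vals, clip=30.0):
    """Exponentiate log-space values without overflow.

    Subtract the max before exp, clip the shifted values to [-clip, clip],
    and restore the shift on the scalar normalizer in log-space.
    """
    m = max(log_vals)
    if not math.isfinite(m):
        raise ValueError("degenerate slice: no finite log values")
    # after the shift, the largest value is exactly 0, so exp() is safe
    shifted = [min(max(v - m, -clip), clip) for v in log_vals]
    exp_vals = [math.exp(v) for v in shifted]
    # shift restored in log-space: log(sum(exp(v))) = m + log(sum(exp(v - m)))
    log_normalizer = m + math.log(sum(exp_vals))
    return exp_vals, log_normalizer
```

For inputs like `[1000.0, 999.0]`, a naive `math.exp(1000.0)` overflows, while the shifted version stays finite and returns the exact log-normalizer `1000 + log(1 + e^-1)`.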
…om saved state

Pickling a TopographyModel was failing with "_pickle.PicklingError: Can't
pickle <class 'mutopia.gtensor.xarr_extensions.FetchSample'>: it's not the
same object as mutopia.gtensor.xarr_extensions.FetchSample".

Two compounding issues:

1. xarr_extensions.py defined four classes named FetchSample (for
   list_samples, mutate, fetch_sample, iter_samples) and two named AsCSR
   (for ascsr and asdense). Only the last definition is bound to the
   module name, so pickle's class-identity check failed for any of the
   shadowed classes once xarray had cached an accessor instance on a
   Dataset. Renamed each to a unique class name.

2. The Optuna trial callback in tuning.py was a partial that captured
   train[0] as an unused kwarg. The partial was stored on the model via
   set_params(callback=...) and pickled along with it, dragging the whole
   training dataset (including any cached accessors) into the saved
   state. Dropped the unused capture and the dead code that referenced
   it.

Also added TopographyModel.__getstate__ to null out `callback` before
pickling, so any future closures captured by training-loop hooks cannot
pin training data into a saved model.
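A minimal sketch of that `__getstate__` guard; the attribute names besides `callback` are illustrative, not the real model's fields:

```python
import pickle

class TopographyModel:
    def __init__(self):
        self.callback = None   # training-loop hook; may close over data
        self.components = 2    # stand-in for real fitted state

    def __getstate__(self):
        state = self.__dict__.copy()
        state["callback"] = None  # never pin training data into saved models
        return state
```

With this hook, a model whose callback is an unpicklable closure still round-trips through pickle, and the reloaded model's `callback` is `None`.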
Source of model instability has been identified, so the per-iteration
finite checks, log-space max-subtract, and clip(-30, 30) added in
f17c3c7 are no longer needed and add overhead in the offsets path.
`v / 1` was a debugging hack accidentally committed in d78a4f6;
restored to `v / extent` so bars normalize to the global max and
fit within the [0, 1.1] ylim.
- 31 tests covering inference, gtensor ops, save/load, plotting, and CLI
- fixtures hosted on sigscape/MuTopia 'test-fixtures' release; conftest
  downloads on first use and caches under tests/fixtures/
- tests/build_chr22_fixture.py regenerates fixtures from tutorial data
- tests.yml workflow runs on push and PR
- publish.yml gated on tests passing before PyPI upload
- joblib added as explicit dependency (was transitive via sklearn)
mutopia.__version__ is now read from package metadata rather than
hard-coded in __init__.py. Tagging vX.Y.Z is the only step needed
to release; setuptools_scm derives the wheel/package version from
the tag at build time. Between tags, dev installs report e.g.
1.0.8.dev3+gabc1234 automatically.

mutopia/_version.py is generated at build, gitignored.
Five new tests in tests/test_training.py exercise full-batch and SVI
training, save/reload after fit, and init_components with COSMIC names.
They are skipped by default; run with:

    pytest tests/test_training.py --runslow

Conftest grows a --runslow option and a 'slow' marker. CI keeps the
fast tier (no training) so PR checks stay under 30s.
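The opt-in slow tier follows the standard pytest pattern for custom options and markers; this sketch is that conventional recipe, not necessarily the project's exact conftest:

```python
# conftest.py (sketch): add --runslow, register the 'slow' marker, and
# skip slow-marked tests unless the option is given.
import pytest

def pytest_addoption(parser):
    parser.addoption("--runslow", action="store_true", default=False,
                     help="run slow training tests")

def pytest_configure(config):
    config.addinivalue_line("markers", "slow: marks tests as slow to run")

def pytest_collection_modifyitems(config, items):
    if config.getoption("--runslow"):
        return  # --runslow given: do not skip anything
    skip_slow = pytest.mark.skip(reason="need --runslow option to run")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

Tests then opt in with `@pytest.mark.slow` and run only under `pytest --runslow`.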

Note: test_batch_training_test_scores_nondecreasing currently fails on
the chr22-train / chr1-test fixture (test score degrades monotonically
across epochs). Likely distributional mismatch between fixture chroms
or overfitting at k=2; worth investigating before relying on this as
a regression gate.
… Zenodo

Previous training tests used the chr22 fixture, which is too small and
distributionally dissimilar from the chr1 test set to give meaningful
convergence signal. Slow-tier tests now download Liver-HCC.nc (~540 MB)
from Zenodo record 18803136, cache to tests/fixtures/zenodo/, and split
on chr1 for held-out evaluation.

Run with:
    pytest tests/test_training.py --runslow

Default fast tier and CI are unchanged.
@AllenWLynch merged commit 9fc41bd into main on Apr 28, 2026
2 checks passed
@AllenWLynch deleted the imports branch on April 28, 2026 at 15:11