feat(config): out-of-process NEST kernel for config-build-time checks (#227) #235
…#227)

Building a `NestSimulation` config used to mutate NEST's global kernel state in the user's process via `NestSynapseSettings.__boot__` (which called `nest.GetDefaults`). This change introduces a `ContextVar`-backed `BuildContext` set by the root build, into which `bsb-nest` lazily registers a `multiprocessing.managers.BaseManager`-backed proxy that runs NEST in a subprocess. The `delay` `required=` checker queries the proxy instead, so config loading no longer imports NEST in-process.

Highlights:

- `bsb.config.build_context()` / `BuildContext` / `get_/set_config_build_context()`, with LIFO cleanup callbacks; `wrap_root_postnew` activates it for the whole root build incl. `_resolve_references`.
- `NodeKwargs` carries the node `instance` so `required=` callables can walk parents (used to discover the enclosing `NestSimulation.modules`).
- `bsb_nest.get_nest_kernel_proxy()` spawns a `NestKernelManager` subprocess on first need, stores the proxy at `ctx.bsb_nest.kernel`, and shuts it down on context exit.
- `NestSynapseSettings.delay` is back to a callable `required=` (`_is_delay_required`). Any failure to reach the proxy or look up the model warns + falls back to `required=False`, matching the spec.
- Top-level `import nest` statements in `bsb_nest` are made lazy so that `import bsb_nest` no longer drags NEST into the user's process.

Behaviour change: the unknown-model and "model needs delay" checks used to hard-fail at boot via `ConfigurationError`; they now warn at config time and surface later (e.g. at `nest.Connect`). Updated `test_unknown_synapse` and `test_error_gap_junctions_syn` accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
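The `ContextVar`-backed context with LIFO cleanup callbacks described in this commit can be pictured with a minimal, standard-library-only sketch. Every name below (`SketchBuildContext`, `sketch_build_context`, `get_active_context`, `add_cleanup`) is hypothetical and chosen for illustration; the actual `bsb.config` API and its internals may differ.

```python
import contextlib
from contextvars import ContextVar
from types import SimpleNamespace

# Hypothetical names throughout; this is not the bsb.config implementation.
_active = ContextVar("active_build_context", default=None)


class SketchBuildContext:
    """Attribute-style namespace with LIFO cleanup callbacks."""

    def __init__(self):
        self._cleanups = []

    def __getattr__(self, name):
        # Only reached when normal lookup fails: auto-vivify a sub-namespace.
        if name.startswith("_"):
            raise AttributeError(name)
        namespace = SimpleNamespace()
        setattr(self, name, namespace)
        return namespace

    def add_cleanup(self, callback):
        self._cleanups.append(callback)

    def close(self):
        while self._cleanups:
            callback = self._cleanups.pop()  # last registered runs first
            with contextlib.suppress(Exception):
                callback()


@contextlib.contextmanager
def sketch_build_context():
    ctx = SketchBuildContext()
    token = _active.set(ctx)
    try:
        yield ctx
    finally:
        _active.reset(token)
        ctx.close()


def get_active_context():
    return _active.get()


# Usage: a plugin stores a shared resource and registers its teardown.
events = []
with sketch_build_context() as ctx:
    ctx.bsb_nest.kernel = "proxy placeholder"
    ctx.add_cleanup(lambda: events.append("registered first"))
    ctx.add_cleanup(lambda: events.append("registered second"))
    seen_inside = get_active_context() is ctx
```

Running the cleanups in reverse registration order lets a resource that was set up later (and may depend on an earlier one) be torn down first.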
Codecov Report

Coverage diff (main vs. #235):

| | main | #235 | +/- |
|---|---|---|---|
| Coverage | 80.54% | 85.42% | +4.87% |
| Files | 188 | 103 | -85 |
| Lines | 18179 | 11813 | -6366 |
| Branches | 2174 | 1387 | -787 |
| Hits | 14642 | 10091 | -4551 |
| Misses | 2977 | 1415 | -1562 |
| Partials | 560 | 307 | -253 |
|
The previous commit lumped "unknown model" together with "can't reach the kernel": both warned and fell back to required=False. But the two failure modes are conceptually different: if we *can* reach the kernel and it tells us the model doesn't exist, that's a real config error and deserves the same hard fail that `NestSynapseSettings.__boot__` used to give pre-#228. The soft warn-and-fall-back is reserved for cases where we genuinely can't reach the proxy (no build context, kernel spawn failed, IPC error).

- `_is_delay_required` now raises `ConfigurationError` for unknown models when the proxy lookup succeeded; only proxy-unreachable / IPC-failure paths warn and return False.
- `test_unknown_synapse` in test_nest.py expects `ConfigurationError` again (was temporarily relaxed to `assertWarns(KernelWarning)`).
- `test_unknown_synapse_warns_and_falls_back` renamed to `test_unknown_synapse_is_hard_error_when_proxy_reachable` and updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
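The split between the hard and the soft failure mode might look roughly like the sketch below. It is an illustration under assumptions, not the `bsb_nest` code: `KernelUnreachable`, `FakeProxy`, the local `ConfigurationError` stand-in, and the function signature are all hypothetical, and the delay table is made-up sample data.

```python
import warnings


class ConfigurationError(Exception):
    """Stand-in for the BSB's ConfigurationError."""


class KernelUnreachable(Exception):
    """Hypothetical marker for 'no context / spawn failed / IPC error'."""


class FakeProxy:
    """Hypothetical in-memory substitute for the kernel proxy."""

    _delays = {"static_synapse": True, "gap_junction": False}  # sample data

    def models(self):
        return list(self._delays)

    def has_delay(self, model):
        return self._delays[model]


def is_delay_required(get_proxy, model):
    try:
        proxy = get_proxy()
        known = model in proxy.models()
        needs_delay = proxy.has_delay(model) if known else False
    except KernelUnreachable as e:
        # Soft path: we could not ask, so do not block a possibly valid config.
        warnings.warn(f"NEST kernel proxy unreachable: {e}", stacklevel=2)
        return False
    if not known:
        # Hard path: the kernel answered, and the model does not exist.
        raise ConfigurationError(f"Unknown synapse model '{model}'.")
    return needs_delay


def unreachable():
    raise KernelUnreachable("no active build context")
```

The key design point is that the `except` clause only covers reachability problems, so an authoritative "no such model" answer can never be downgraded to a warning.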
- `_build_context.py`: drop the private `_AutoNamespace` base class so Sphinx's autodoc doesn't trip on the unresolved cross-reference; inline the auto-vivification into `BuildContext.__getattr__` and use `contextlib.suppress` for the cleanup callback's `try/except/pass`.
- Combine nested `with` blocks across the test files (SIM117).
- Add `raise ... from e` to the NEST skip in test_build_context.py (B904).
- Ruff also reorganised the `bsb` import block in `connection.py` (line length) and removed an unused `KernelWarning` import from test_nest.py that the prior commit left behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test fetches the Allen mouse brain ontology from api.brain-map.org and fails with FileNotFoundError when the API is unreachable (currently returning 503). Same `skip_test_allen_api` helper that already guards `TestAllenVoxels` works here — apply it at the method level since the rest of `TestNrrdVoxels` is network-independent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new `config/build-lifecycle.rst` covering the four-phase config load (parse → build → boot → use), the essential `__post_new__` and `__boot__` hooks, and the `BuildContext` API introduced by #235: how to register and look up shared resources on the active build context, the `__dict__.get(...)` idiom for non-vivifying leaf reads, and how `required=` callables consume it. Includes the bsb_nest NEST proxy as a worked example. Also fills in a previously undocumented capability — `required=` accepts a callable, not just a bool — in `config/attributes.rst`, and links it to the new build-lifecycle page. Wires the new page into the configuration toctree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
House rule: no em dashes in docs. Rewrote the parenthetical asides in build-lifecycle.rst and attributes.rst with colons, commas, and parentheses instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the Conventions section of docs/dev/documentation.rst with the house style rules that were previously only tribal knowledge: no em/en dashes, no development-history references in committed text, and "the BSB" in prose. Adds a root CLAUDE.md that points agents at that section and states the mechanical pre-finish grep check for stray dashes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The conventions live in docs/dev/documentation.rst; CLAUDE.md just points there instead of restating the rules. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude files live under .claude/ (alongside package-boilerplate.md and skills/), so the pointer file belongs there too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the convention to docs/dev/documentation.rst and applies it to the build-lifecycle refs so they render the final component (e.g. BuildContext) instead of the full dotted path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the "required= checkers" and "NEST kernel as a proxy" sections from build-lifecycle.rst; they were too feature-specific for a general lifecycle page. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lazy in-method `import nest` refactor was unnecessary: bsb-nest depends on nest, so importing it at module top is fine. The out-of-process kernel proxy is what keeps config building from mutating the in-process kernel; avoiding the import itself was extra scope. Restores top-level `import nest` across adapter, cell, device, distributions, and the device plugins, and drops the now-moot "import bsb_nest is nest-free" test. Also removes the `if __name__ == "__main__": unittest.main()` blocks from the new test modules; not a pattern used here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
distributions.py builds its distribution-name validation list at import time, so a top-level nest import would pull NEST in just to enumerate names during config attribute setup. Keep the deferred _LazyDistributionNames so that computation only touches NEST when a name is actually validated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@Helveg, It seems that the scaffold = from_storage("my_network.hdf5")
import nest
nest.Models() # contains the module that should be loaded only by the simulation adapter |
Clarifies that the handle is the node mid-construction: its identity, parent, and key are set, but its attributes are not yet assigned when the requirement check runs. Adds a docstring documenting the depth-first, parent-first build order and that ancestors are only partially built. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves the "node is only partially built during a required= check" caveat into a `.. warning::` admonition so it stands out, including the depth-first/parent-first ordering and the declared-before-branch rule for sibling attribute visibility. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tightens the admonition and clarifies the key reassurance: build order follows attribute declaration order, not the order keys appear in the user's configuration, so parent-attribute availability is deterministic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add `_NestKernel.load_modules`, which installs a simulation's modules into the out-of-process kernel only once per build and reports the ones it can't find, plus a shared `load_simulation_modules` helper that walks up to the enclosing simulation and raises a `ConfigurationError` for a missing module. The synapse delay checker now uses the helper instead of its own install loop. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
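A once-per-build module loader that reports the modules it cannot find could be sketched as follows. `SketchKernel` and the injected `installer` callable are hypothetical; in a real kernel process the installer would presumably wrap NEST's module installation, which is replaced here by a fake so the example runs without NEST.

```python
class SketchKernel:
    """Hypothetical kernel-side helper that installs each module at most once."""

    def __init__(self, installer):
        self._install = installer  # callable that raises if a module is missing
        self._loaded = set()

    def load_modules(self, modules):
        """Install modules not seen before; return the ones that failed."""
        missing = []
        for module in modules:
            if module in self._loaded:
                continue  # already installed during this build
            try:
                self._install(module)
            except Exception:
                missing.append(module)
            else:
                self._loaded.add(module)
        return missing


# Usage with a fake installer that only knows one module.
calls = []


def fake_install(name):
    calls.append(name)
    if name != "my_module":
        raise RuntimeError(f"module {name} not found")


kernel = SketchKernel(fake_install)
first = kernel.load_modules(["my_module", "absent_module"])
second = kernel.load_modules(["my_module"])
```

A caller such as the `load_simulation_modules` helper mentioned above could then turn a non-empty return value into a `ConfigurationError`.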
…onfigs `NestSimulation.__boot__` installed the configured modules into the in-process kernel during `from_storage`, so merely loading a network changed `nest`'s global state. Drop the hook; modules are installed at adapter prepare time and, for config-time validation, into the out-of-process proxy. Validate cell models through a `nest_node_model` type handler that queries the proxy, mirroring the synapse model check, instead of a boot-time in-process `nest.Models` lookup. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Factor the cell model validator into a `NestModelTypeHandler` base on the kernel proxy and add a `nest_synapse_model` subclass so synapse models are validated the same way: on the `model` attribute rather than as a side effect of the `delay` `required=` checker, which now only answers whether a delay is required. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@drodarie good catch, you were right. I audited every
While in there I also:
Net effect: loading a config validates cell + synapse models and installs modules entirely out-of-process, so the in-process
|
drodarie
left a comment
There was a problem hiding this comment.
Quite a technical solution. I think I understand the idea behind it, but there is a major flaw: the initialization order of node attributes is not guaranteed, because they are stored in a dict. Here, however, the `modules` attribute of `NestSimulation` must already be set so that it can be used for `cell_models` or `connection_models`.
Attribute build order is taken from get_config_attributes, which merged each class's attributes parent-first. An overridden attribute kept its parent position, so a subclass attribute (e.g. NestSimulation.modules) declared before an overridden one (cell_models) was built after it. Build order is unrelated to the order keys appear in the user's config, so a checker that reads a sibling set earlier in the subclass body could not rely on it being assigned yet. Splice each class's declared attributes into the inherited order instead: overrides keep their inherited slot and a subclass's new attributes are threaded in at the position the subclass declares them, so every class's declaration order is preserved as a subsequence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
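The splicing rule can be illustrated with a small sketch. It covers only the simple case where the subclass lists its overrides in the same relative order as the inherited one (reordering is out of scope here), and the function name, the placement of leading new attributes, and the attribute lists are assumptions made for illustration rather than the behaviour of `get_config_attributes`.

```python
def splice_order(inherited, declared):
    """Thread a subclass's new attributes into the inherited build order.

    Overrides keep their inherited slot; new names are placed next to the
    overrides they were declared around. Assumes the overrides in `declared`
    appear in the same relative order as in `inherited`.
    """
    order = list(inherited)
    pending = []   # new names declared before the first override
    cursor = None  # insertion point just after the last override seen
    for name in declared:
        if name in order:
            idx = order.index(name)
            order[idx:idx] = pending  # flush leading new names before it
            cursor = idx + len(pending) + 1
            pending = []
        elif cursor is None:
            pending.append(name)
        else:
            order.insert(cursor, name)
            cursor += 1
    order.extend(pending)  # no override at all: append, as before
    return order


# Illustrative attribute names: `modules` is declared before the overridden
# `cell_models`, so it must be built before it.
inherited = ["name", "cell_models", "connection_models", "devices"]
declared = ["modules", "cell_models", "connection_models"]
result = splice_order(inherited, declared)
```

With these inputs `result` is `["name", "modules", "cell_models", "connection_models", "devices"]`: both the inherited order and the subclass's declaration order survive as subsequences.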
A subclass can now reorder inherited attributes by redeclaring them, as long as it redeclares every attribute that sits between the ones it moves; the build order then follows the subclass. When attributes it does not redeclare are caught between reordered ones the result is ambiguous, so get_config_attributes raises a ConfigurationError naming them. Folds the per-class splice into get_config_attributes so the whole ordering rule lives in one function. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ordering Raise a dedicated AttributeOrderError (a ConfigurationError subclass) for an ambiguous attribute-reordering instead of a bare ConfigurationError, document the ordering algorithm as bullet points on get_config_attributes, cover the rules in test_configuration, and add an "Attribute ordering" section with examples to the configuration docs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The fallback that recovers attributes registered on a class's merged set but absent from its __dict__ also re-added attributes a subclass had unset (e.g. Rhomboid.dimensions, which Layer drops), because the unset key was popped from the working set but still present on an ancestor's _config_attrs. Track unset keys and skip them in the fallback. Adds a regression test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… are built load_simulation_modules now stops at the ancestor that owns the `modules` attribute and raises if it is not built yet, instead of walking past it and silently validating cell/connection models against a kernel missing the simulation's modules. The build order guarantees `modules` precedes the models today; this makes a future reorder fail loudly rather than silently. Adds tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
@drodarie thanks, you were right about the ordering not being guaranteed. Fixed it at the source rather than working around it: Attribute build order is now declaration-order-preserving ( A subclass may also reorder inherited attributes by redeclaring them, but only if it redeclares every attribute caught between the ones it moves; otherwise the order is ambiguous and a new Defensive guard (d7f64b0): Also fixed a regression the ordering change introduced: the fallback that recovers dynamically-registered attributes was resurrecting attributes a subclass had unset ( Re-requesting review. |
The build-time kernel proxy used a multiprocessing BaseManager, whose child inherits (fork) or re-imports (spawn) the parent's NEST; installing a module there crashes once NEST has been imported in the parent, which the bsb_nest plugin always does. Launch the kernel by file path as an independent subprocess that imports nest only on its own main thread and answers over a multiprocessing.connection pipe, so a third party may import nest before the BSB without affecting the kernel subprocess. A thin _KernelProxy keeps the load_modules / has_delay / models interface; load_simulation_modules and the model type handlers are unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
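As a rough picture of the design, the sketch below starts an independent Python subprocess and talks to it over a `multiprocessing.connection` channel. It is not the `bsb_nest` code: the child is passed inline with `-c` instead of by file path, it serves made-up data instead of importing `nest`, the `SketchKernelProxy` name and message format are hypothetical, and the auth key travels on the command line purely for brevity.

```python
import os
import subprocess
import sys
from multiprocessing.connection import Listener

# Child program. A real kernel process would `import nest` here, on its own
# main thread, and answer from the actual kernel.
CHILD = r"""
import sys
from multiprocessing.connection import Client

host, port, key = sys.argv[1], int(sys.argv[2]), bytes.fromhex(sys.argv[3])
delays = {"static_synapse": True, "gap_junction": False}  # stand-in data
with Client((host, port), authkey=key) as conn:
    while True:
        try:
            method, arg = conn.recv()
        except EOFError:
            break
        if method == "shutdown":
            break
        if method == "has_delay":
            conn.send(delays.get(arg))
"""


class SketchKernelProxy:
    """Hypothetical thin proxy; blocks in accept() until the child connects."""

    def __init__(self):
        key = os.urandom(16)
        self._listener = Listener(("127.0.0.1", 0), authkey=key)
        host, port = self._listener.address
        self._proc = subprocess.Popen(
            [sys.executable, "-c", CHILD, host, str(port), key.hex()]
        )
        self._conn = self._listener.accept()

    def has_delay(self, model):
        self._conn.send(("has_delay", model))
        return self._conn.recv()

    def close(self):
        self._conn.send(("shutdown", None))
        self._conn.close()
        self._listener.close()
        self._proc.wait(timeout=10)


proxy = SketchKernelProxy()
try:
    answer = proxy.has_delay("static_synapse")
    unknown = proxy.has_delay("no_such_model")
finally:
    proxy.close()
```

Because the child is a fresh interpreter rather than a fork or a `multiprocessing` spawn of the parent, nothing the parent has imported is carried over into it.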
Add a minimal NESTML neuron and a test that builds it into a NEST module (skipping when pynestml, NEST, or the build toolchain is unavailable), then checks it loads in the out-of-process kernel proxy, that a module-provided cell model validates while an unknown one is rejected, and that loading it never leaks into the main process's NEST. Add nestml to the test deps and bump CI NEST to 3.8 to match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`nestml~=8.3` excludes pre-releases, but the only published 8.3 is 8.3.0rc3, so `uv sync` found no solution and failed every bsb-nest task (lint, docs, build). Pin to `~=8.3.0rc2` (as cerebellar-models does) so the release candidate resolves. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Both the model type handlers and the delay-required checker need to reach the build's out-of-process kernel, install the enclosing simulation's modules, run a query, and degrade gracefully when the kernel is unreachable. Factor that into `query_kernel`; `NestModelTypeHandler` and `_is_delay_required` now just pass a query and a fallback instead of each reimplementing the scaffolding. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NEST >= 3.8 requires Boost (>= 1.69), so its CMake configure failed without it, leaving no nest_vars.sh and breaking the test step. Add libboost-dev. Also include NEST_VERSION in the NEST cache key so bumping the version no longer restores a stale install. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NEST's CMake invokes Python helpers; when install-nest runs as a dependency of
the OTEL-instrumented `nx test`, the auto-instrumentation injected into those
helpers writes to stderr and breaks NEST's configure step ("Configuring
incomplete"). nx propagates the wrapper's environment to dependency tasks, so a
separate target alone doesn't insulate it. Build NEST in its own step before the
instrumented test step; the test target's install-nest dependency then finds
NEST installed and exits early.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When the test suite runs under `mpiexec`, every rank inherits the launcher's environment (OMPI_*/PMI_*/PMIX_*). The out-of-process kernel subprocess would inherit it too, so its `import nest` called MPI_Init and blocked forever trying to enroll as a rank of the parent job, hanging the build. Strip those variables (and OTEL_*, plus set OTEL_SDK_DISABLED) so the kernel runs as an independent, uninstrumented process. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
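Stripping the launcher and instrumentation variables before starting the kernel subprocess might be done along these lines. The helper name `isolated_env` is hypothetical, and the prefix list simply mirrors the variables named in the commit message.

```python
import os

# Prefixes named in the commit message: MPI launcher and OpenTelemetry vars.
_STRIP_PREFIXES = ("OMPI_", "PMI_", "PMIX_", "OTEL_")


def isolated_env(env=None):
    """Return a copy of the environment without MPI/OTEL launcher variables."""
    source = os.environ if env is None else env
    clean = {
        key: value
        for key, value in source.items()
        if not key.startswith(_STRIP_PREFIXES)
    }
    # Re-added after stripping so the child stays uninstrumented.
    clean["OTEL_SDK_DISABLED"] = "true"
    return clean


sample = {
    "PATH": "/usr/bin",
    "OMPI_COMM_WORLD_RANK": "0",
    "PMI_RANK": "0",
    "PMIX_RANK": "0",
    "OTEL_SERVICE_NAME": "nx-test",
}
cleaned = isolated_env(sample)
```

The resulting mapping can be passed as the `env` argument of `subprocess.Popen`, so the child no longer sees itself as a rank of the parent MPI job.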
…pped NESTML compiles its module with CMake, which builds and runs small NEST-linked helper programs to probe the kernel. Under `mpiexec` those helpers inherited the launcher environment and blocked on MPI_Init, hanging the suite. Build the module once on the main rank (the others wait on `bcast`) and run the build with the MPI launcher variables stripped so the helpers run as plain processes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
@drodarie this round is mostly CI hardening on top of the proxy + attribute-ordering fixes; it should be ready for another look. The out-of-process kernel now survives MPI. The bsb-nest suite runs under NESTML extension-module coverage. Added a test that builds a minimal NESTML module and loads it through the proxy, asserting the model lands in the out-of-process kernel and never touches the main process. The build itself (CMake compiling and running NEST helper programs) hit the same NEST 3.8 and CI plumbing. Bumped NEST to 3.8 (added Verified locally against a from-source NEST 3.8 build, both single-process and under |
drodarie
left a comment
There was a problem hiding this comment.
So now we have a build context which stores a proxy that interfaces with an isolated server that contains a NEST kernel.
I wonder if we did not get too deep on this one ;)
|
Too deep? This is a perfect precedent for dealing with simulators that store global state 👍 |
Summary
Closes #227: loading a `NestSimulation` config no longer mutates NEST's global kernel state in the user's process.

- `bsb.config.BuildContext` (`ContextVar`-backed, attribute-style namespace with LIFO cleanup callbacks). The root build activates it for the whole construction sequence in `wrap_root_postnew`.
- `bsb_nest.get_nest_kernel_proxy()` lazily spawns a `multiprocessing.managers.BaseManager` subprocess that imports NEST, registers the proxy at `ctx.bsb_nest.kernel`, and shuts down on context exit. The proxy is what keeps config building from touching the in-process kernel.
- `NestSynapseSettings.delay` is back to a callable `required=` (`_is_delay_required`): same intent as the pre-#228 (fix: Loading modules to check NEST models) `_is_delay_required` callable, but routed through the out-of-process proxy.
- `import nest` stays at module top in the runtime modules (adapter, cell, device, devices). `distributions.py` keeps a lazy import via `_LazyDistributionNames`, because it would otherwise import NEST at module-load time just to enumerate distribution names for an attribute validator.

Validation semantics
The checker separates two failure modes:

- Proxy reachable, model unknown: `ConfigurationError` (matches the pre-#228 boot-time check; config still rejects bad model names at build time).
- Proxy unreachable: `False`, so config loading stays robust; the real NEST error surfaces later at adapter prepare/connect time.

The "delay needed?" answer itself is always soft: an unreachable proxy never blocks a valid config.
Note: post-build mutation (`cfg.simulations[...] = ...`) runs outside the build context, so the hard model-validity check requires wrapping the mutation in `with bsb.config.build_context():` if you want strict validation; otherwise the bad model surfaces at sim time.

Status: please review the design
@drodarie, could you sanity-check this against the cerebellar models? Specifically:

- `delay` requirement is now soft when the proxy can't be reached; the model-name validation stays hard when the proxy IS reachable. For your typical config-mutation flows, do you need help wrapping them in `build_context()`, or should the proxy spawn even outside the root build?
- … `BuildContext` and shuts down on exit. With your larger configs, is the per-build subprocess startup cost (a few tenths of a second) an issue, or should we look at reuse?

Happy to iterate on any of the three.
Test plan
- `pytest packages/bsb-core/tests/test_build_context.py` (9 tests, all pass)
- `pytest packages/bsb-nest/tests/test_build_context.py` (7 tests, all pass: proxy lifecycle, delay-required/optional, unknown-model hard error, and the no-context / proxy-unreachable fallbacks)
- `bsb-core` suite: 478 pass, 11 skipped, 0 failures
- `test_error_gap_junctions_syn` (now expects `RequirementError` inside `build_context()`), `test_unknown_synapse` (now expects `ConfigurationError` inside `build_context()`), `test_unknown_modules`, `test_unknown_cell`: all pass

🤖 Generated with Claude Code
📚 Documentation preview 📚: https://bsb-nest--235.org.readthedocs.build/en/235/
📚 Documentation preview 📚: https://bsb--235.org.readthedocs.build/en/235/
📚 Documentation preview 📚: https://bsb-core--235.org.readthedocs.build/en/235/