feat(config): out-of-process NEST kernel for config-build-time checks (#227) #235

Merged
Helveg merged 35 commits into main from worktree-issue-227-build-context
May 30, 2026

Contributor

@Helveg Helveg commented May 27, 2026

Summary

Closes #227 — loading a NestSimulation config no longer mutates NEST's global kernel state in the user's process.

  • New bsb.config.BuildContext (ContextVar-backed, attribute-style namespace with LIFO cleanup callbacks). The root build activates it for the whole construction sequence in wrap_root_postnew.
  • bsb_nest.get_nest_kernel_proxy() lazily spawns a multiprocessing.managers.BaseManager subprocess that imports NEST, registers the proxy at ctx.bsb_nest.kernel, and shuts down on context exit. The proxy is what keeps config building from touching the in-process kernel.
  • NestSynapseSettings.delay is back to a callable required= (_is_delay_required): same intent as the pre-#228 _is_delay_required callable, but routed through the out-of-process proxy.
  • import nest stays at module top in the runtime modules (adapter, cell, device, devices). distributions.py keeps a lazy import via _LazyDistributionNames, because it would otherwise import NEST at module-load time just to enumerate distribution names for an attribute validator.
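
As a rough illustration of the first bullet, a ContextVar-backed context with an attribute-style namespace and LIFO cleanup callbacks could look like the sketch below. It is a stand-alone approximation built on the standard library only, not the code in bsb.config: `add_cleanup`, `get_build_context`, and all internals are assumed names chosen for the example.

```python
import contextlib
from contextvars import ContextVar
from types import SimpleNamespace

# Holds the active context; None means "no build in progress".
_active = ContextVar("active_build_context", default=None)


class BuildContext:
    """Hypothetical stand-in: a namespace plus a stack of cleanup callbacks."""

    def __init__(self):
        self._cleanups = []

    def __getattr__(self, name):
        # Only reached when normal lookup fails: auto-create a sub-namespace
        # so callers can write `ctx.bsb_nest.kernel = ...` directly.
        if name.startswith("_"):
            raise AttributeError(name)
        namespace = SimpleNamespace()
        setattr(self, name, namespace)
        return namespace

    def add_cleanup(self, callback):
        self._cleanups.append(callback)

    def close(self):
        # Run callbacks last-in, first-out; one failing callback
        # must not prevent the remaining ones from running.
        while self._cleanups:
            callback = self._cleanups.pop()
            with contextlib.suppress(Exception):
                callback()


@contextlib.contextmanager
def build_context():
    ctx = BuildContext()
    token = _active.set(ctx)
    try:
        yield ctx
    finally:
        _active.reset(token)
        ctx.close()


def get_build_context():
    return _active.get()
```

A plugin would register its resource on first use (`ctx.bsb_nest.kernel = proxy`) and add a cleanup that shuts it down, so the resource lives exactly as long as the build.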

Validation semantics

The checker separates two failure modes:

  • Proxy reachable, model unknown → hard ConfigurationError (matches the pre-#228 boot-time check: config still rejects bad model names at build time)
  • Proxy unreachable (no build context, kernel spawn failed, IPC error) → warn + return False so config loading stays robust; the real NEST error surfaces later at adapter prepare/connect time

The "delay needed?" answer itself is always soft — an unreachable proxy never blocks a valid config.
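
The split between the two failure modes can be sketched as follows. This is a simplified illustration, not the bsb_nest checker: `FakeProxy`, the `get_proxy` parameter, the model names, and the exact exception types treated as "unreachable" are assumptions; only the `models` / `has_delay` names come from the proxy interface mentioned in this PR, and their signatures here are guesses.

```python
import warnings


class ConfigurationError(Exception):
    """Stand-in for the BSB's ConfigurationError."""


class FakeProxy:
    """Hypothetical proxy: maps model name -> whether it needs a delay."""

    def __init__(self, delays):
        self._delays = dict(delays)

    def models(self):
        return set(self._delays)

    def has_delay(self, model):
        return self._delays[model]


def is_delay_required(get_proxy, model):
    """Answer "delay needed?"; hard-fail only when a reachable kernel rejects the model."""
    try:
        proxy = get_proxy()
        known = model in proxy.models()
        needs_delay = proxy.has_delay(model) if known else False
    except (ConnectionError, EOFError, LookupError) as e:
        # Proxy unreachable: stay soft so config loading remains robust.
        warnings.warn(f"NEST kernel proxy unreachable ({e!r}); delay treated as optional")
        return False
    if not known:
        # Proxy reachable and the model does not exist: a real config error.
        raise ConfigurationError(f"Unknown NEST model '{model}'")
    return needs_delay
```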

Note: post-build mutation (cfg.simulations[...] = ...) runs outside the build context. For strict model validation, wrap the mutation in a `with bsb.config.build_context():` block; otherwise the bad model only surfaces at simulation time.

Status — please review the design

@drodarie — could you sanity-check this against the cerebellar models? Specifically:

  1. Does config loading still succeed end-to-end?
  2. The synapse-delay requirement is now soft when the proxy can't be reached; the model-name validation stays hard when the proxy IS reachable. For your typical config-mutation flows, do you need help wrapping them in build_context(), or should the proxy spawn even outside the root build?
  3. The proxy spawns once per BuildContext and shuts down on exit. With your larger configs, is the per-build subprocess startup cost (a few tenths of a second) an issue, or should we look at reuse?

Happy to iterate on any of the three.

Test plan

  • pytest packages/bsb-core/tests/test_build_context.py (9 tests, all pass)
  • pytest packages/bsb-nest/tests/test_build_context.py (7 tests, all pass — proxy lifecycle, delay-required/optional, unknown-model hard error, and the no-context / proxy-unreachable fallbacks)
  • Full bsb-core suite: 478 pass, 11 skipped, 0 failures
  • Updated test_error_gap_junctions_syn (now expects RequirementError inside build_context()), test_unknown_synapse (now expects ConfigurationError inside build_context()), test_unknown_modules, test_unknown_cell — all pass
  • @drodarie validation against cerebellar models

🤖 Generated with Claude Code


📚 Documentation preview 📚: https://bsb-nest--235.org.readthedocs.build/en/235/


📚 Documentation preview 📚: https://bsb--235.org.readthedocs.build/en/235/


📚 Documentation preview 📚: https://bsb-core--235.org.readthedocs.build/en/235/

…#227)

Building a `NestSimulation` config used to mutate NEST's global kernel
state in the user's process via `NestSynapseSettings.__boot__` (which
called `nest.GetDefaults`). This change introduces a `ContextVar`-backed
`BuildContext` set by the root build, into which `bsb-nest` lazily
registers a `multiprocessing.managers.BaseManager`-backed proxy that
runs NEST in a subprocess. The `delay` `required=` checker queries the
proxy instead, so config loading no longer imports NEST in-process.

Highlights:

- `bsb.config.build_context()` / `BuildContext` / `get_/set_config_build_context()`,
  with LIFO cleanup callbacks; `wrap_root_postnew` activates it for the
  whole root build incl. `_resolve_references`.
- `NodeKwargs` carries the node `instance` so `required=` callables can
  walk parents (used to discover the enclosing `NestSimulation.modules`).
- `bsb_nest.get_nest_kernel_proxy()` spawns a `NestKernelManager`
  subprocess on first need, stores the proxy at `ctx.bsb_nest.kernel`,
  and shuts it down on context exit.
- `NestSynapseSettings.delay` is back to a callable `required=`
  (`_is_delay_required`). Any failure to reach the proxy or look up the
  model warns + falls back to `required=False`, matching the spec.
- Top-level `import nest` statements in `bsb_nest` are made lazy so that
  `import bsb_nest` no longer drags NEST into the user's process.

Behaviour change: the unknown-model and "model needs delay" checks used
to hard-fail at boot via `ConfigurationError`; they now warn at config
time and surface later (e.g. at `nest.Connect`). Updated
`test_unknown_synapse` and `test_error_gap_junctions_syn` accordingly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the feat label May 27, 2026
@codecov

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 81.84818% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.42%. Comparing base (80ffbd5) to head (d4bfc4d).
⚠️ Report is 1 commit behind head on main.

Files with missing lines                          Patch %   Lines
packages/bsb-nest/bsb_nest/_kernel_server.py      28.30%    37 Missing and 1 partial ⚠️
packages/bsb-nest/bsb_nest/_kernel_proxy.py       92.50%    6 Missing and 3 partials ⚠️
packages/bsb-core/bsb/config/_make.py             90.62%    4 Missing and 2 partials ⚠️
packages/bsb-core/bsb/config/_build_context.py    94.11%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #235      +/-   ##
==========================================
+ Coverage   80.54%   85.42%   +4.87%     
==========================================
  Files         188      103      -85     
  Lines       18179    11813    -6366     
  Branches     2174     1387     -787     
==========================================
- Hits        14642    10091    -4551     
+ Misses       2977     1415    -1562     
+ Partials      560      307     -253     


Helveg and others added 12 commits May 27, 2026 18:57
The previous commit lumped "unknown model" together with "can't reach the
kernel" — both warned and fell back to required=False. But the two
failure modes are conceptually different: if we *can* reach the kernel
and it tells us the model doesn't exist, that's a real config error and
deserves the same hard fail that `NestSynapseSettings.__boot__` used to
give pre-#228. The soft warn-and-fall-back is reserved for cases where
we genuinely can't reach the proxy (no build context, kernel spawn
failed, IPC error).

- `_is_delay_required` now raises `ConfigurationError` for unknown
  models when the proxy lookup succeeded; only proxy-unreachable /
  IPC-failure paths warn and return False.
- `test_unknown_synapse` in test_nest.py expects `ConfigurationError`
  again (was temporarily relaxed to `assertWarns(KernelWarning)`).
- `test_unknown_synapse_warns_and_falls_back` renamed to
  `test_unknown_synapse_is_hard_error_when_proxy_reachable` and updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- `_build_context.py`: drop the private `_AutoNamespace` base class so
  Sphinx's autodoc doesn't trip on the unresolved cross-reference; inline
  the auto-vivification into `BuildContext.__getattr__` and use
  `contextlib.suppress` for the cleanup callback's `try/except/pass`.
- Combine nested `with` blocks across the test files (SIM117).
- Add `raise ... from e` to the NEST skip in test_build_context.py (B904).
- Ruff also reorganised the `bsb` import block in `connection.py` (line
  length) and removed an unused `KernelWarning` import from test_nest.py
  that the prior commit left behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test fetches the Allen mouse brain ontology from
api.brain-map.org and fails with FileNotFoundError when the API
is unreachable (currently returning 503). Same `skip_test_allen_api`
helper that already guards `TestAllenVoxels` works here — apply it
at the method level since the rest of `TestNrrdVoxels` is
network-independent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new `config/build-lifecycle.rst` covering the four-phase
config load (parse → build → boot → use), the essential `__post_new__`
and `__boot__` hooks, and the `BuildContext` API introduced by #235:
how to register and look up shared resources on the active build
context, the `__dict__.get(...)` idiom for non-vivifying leaf reads,
and how `required=` callables consume it. Includes the bsb_nest NEST
proxy as a worked example.

Also fills in a previously undocumented capability — `required=` accepts
a callable, not just a bool — in `config/attributes.rst`, and links it
to the new build-lifecycle page.

Wires the new page into the configuration toctree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
House rule: no em dashes in docs. Rewrote the parenthetical asides in
build-lifecycle.rst and attributes.rst with colons, commas, and
parentheses instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the Conventions section of docs/dev/documentation.rst with the
house style rules that were previously only tribal knowledge: no em/en
dashes, no development-history references in committed text, and "the
BSB" in prose. Adds a root CLAUDE.md that points agents at that section
and states the mechanical pre-finish grep check for stray dashes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The conventions live in docs/dev/documentation.rst; CLAUDE.md just
points there instead of restating the rules.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude files live under .claude/ (alongside package-boilerplate.md and
skills/), so the pointer file belongs there too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the convention to docs/dev/documentation.rst and applies it to the
build-lifecycle refs so they render the final component (e.g.
BuildContext) instead of the full dotted path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the "required= checkers" and "NEST kernel as a proxy" sections
from build-lifecycle.rst; they were too feature-specific for a general
lifecycle page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lazy in-method `import nest` refactor was unnecessary: bsb-nest
depends on nest, so importing it at module top is fine. The out-of-process
kernel proxy is what keeps config building from mutating the in-process
kernel; avoiding the import itself was extra scope. Restores top-level
`import nest` across adapter, cell, device, distributions, and the device
plugins, and drops the now-moot "import bsb_nest is nest-free" test.

Also removes the `if __name__ == "__main__": unittest.main()` blocks from
the new test modules; not a pattern used here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
distributions.py builds its distribution-name validation list at import
time, so a top-level nest import would pull NEST in just to enumerate
names during config attribute setup. Keep the deferred _LazyDistributionNames
so that computation only touches NEST when a name is actually validated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@drodarie
Contributor

drodarie commented May 28, 2026

@Helveg, it seems that the __boot__ function of the NestSimulation class does not use the context proxy, and the same goes for the other boot functions. Shouldn't they use the proxy instead?
My point is that if I do:

from bsb import from_storage

scaffold = from_storage("my_network.hdf5")
import nest

nest.Models()  # contains the module that should be loaded only by the simulation adapter

Helveg and others added 5 commits May 28, 2026 15:00
Clarifies that the handle is the node mid-construction: its identity,
parent, and key are set, but its attributes are not yet assigned when the
requirement check runs. Adds a docstring documenting the depth-first,
parent-first build order and that ancestors are only partially built.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves the "node is only partially built during a required= check" caveat
into a `.. warning::` admonition so it stands out, including the
depth-first/parent-first ordering and the declared-before-branch rule for
sibling attribute visibility.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tightens the admonition and clarifies the key reassurance: build order
follows attribute declaration order, not the order keys appear in the
user's configuration, so parent-attribute availability is deterministic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Helveg Helveg marked this pull request as ready for review May 28, 2026 17:24
@Helveg Helveg requested a review from drodarie May 28, 2026 17:24
Helveg and others added 3 commits May 28, 2026 21:50
Add `_NestKernel.load_modules`, which installs a simulation's modules into
the out-of-process kernel only once per build and reports the ones it can't
find, plus a shared `load_simulation_modules` helper that walks up to the
enclosing simulation and raises a `ConfigurationError` for a missing module.
The synapse delay checker now uses the helper instead of its own install loop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…onfigs

`NestSimulation.__boot__` installed the configured modules into the in-process
kernel during `from_storage`, so merely loading a network changed `nest`'s
global state. Drop the hook; modules are installed at adapter prepare time and,
for config-time validation, into the out-of-process proxy. Validate cell models
through a `nest_node_model` type handler that queries the proxy, mirroring the
synapse model check, instead of a boot-time in-process `nest.Models` lookup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Factor the cell model validator into a `NestModelTypeHandler` base on the
kernel proxy and add a `nest_synapse_model` subclass so synapse models are
validated the same way: on the `model` attribute rather than as a side effect
of the `delay` `required=` checker, which now only answers whether a delay is
required.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Helveg
Contributor Author

Helveg commented May 28, 2026

@drodarie good catch, you were right. I audited every boot/__boot__ hook in bsb-nest (they all run during from_storage, since _boot_nodes walks the whole config tree, simulations included):

  • NestSimulation.__boot__ did nest.Install(...) in-process. That's exactly your repro: loading a network mutated the in-process kernel, so nest.Models() showed the simulation's modules. Removed (3b65aae). Modules are still installed at adapter prepare() time, and into the out-of-process proxy for config-time validation.
  • NestCell.__boot__ did an in-process nest.Models(mtype="nodes") lookup. Replaced with a nest_node_model type handler on NestCell.model that queries the out-of-process proxy.
  • The device boot() hooks (Multimeter, SinusoidalPoissonGenerator) only do config validation and never touch nest.

While in there I also:

  • Made synapse validation symmetric: a nest_synapse_model type handler on NestSynapseSettings.model (4d0fbc0), so _is_delay_required now only answers the delay question.
  • Centralized and deduped module loading on the proxy (load_modules / load_simulation_modules); a nonexistent module is now a hard ConfigurationError at build time (04ce8ef).
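
The "install once per build, report what is missing" behaviour can be illustrated with a small sketch. `FakeKernel` and `require_modules` are hypothetical names used only here, and the in-memory `installable` set stands in for a real NEST install attempt; the actual `load_modules` / `load_simulation_modules` signatures are not shown in this thread.

```python
class ConfigurationError(Exception):
    """Stand-in for the BSB's ConfigurationError."""


class FakeKernel:
    """Hypothetical kernel that remembers which modules it already installed."""

    def __init__(self, installable):
        self._installable = set(installable)
        self._loaded = set()
        self.install_calls = []  # recorded so the dedup is observable

    def load_modules(self, modules):
        """Install each module at most once; return the ones that could not be found."""
        missing = []
        for module in modules:
            if module in self._loaded:
                continue  # already installed during this build
            self.install_calls.append(module)
            if module in self._installable:
                self._loaded.add(module)
            else:
                missing.append(module)
        return missing


def require_modules(kernel, modules):
    """Turn a missing module into a hard configuration error."""
    missing = kernel.load_modules(modules)
    if missing:
        raise ConfigurationError(f"NEST modules not found: {', '.join(missing)}")
```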

Net effect: loading a config validates cell + synapse models and installs modules entirely out-of-process, so the in-process nest kernel is no longer touched at config-build time. The module-level import nest statements in the simulation-time components are left as-is, since importing nest is harmless; it is mutating the in-process kernel that we avoid.

test_unknown_cell / test_unknown_modules are updated to assert their errors under build_context(), like test_unknown_synapse.

Contributor

@drodarie drodarie left a comment

Quite a technical solution. I think I understand the idea behind it, but there is a major flaw: the initialization order of node attributes is not guaranteed, because they are stored in a dict. Here we need the modules attribute of NestSimulation to be set first, so that it can be used for cell_models or connection_models.

Comment thread packages/bsb-nest/bsb_nest/_kernel_proxy.py
Comment thread packages/bsb-nest/bsb_nest/_kernel_proxy.py Outdated
Helveg and others added 2 commits May 29, 2026 12:15
Attribute build order is taken from get_config_attributes, which merged each
class's attributes parent-first. An overridden attribute kept its parent
position, so a subclass attribute (e.g. NestSimulation.modules) declared before
an overridden one (cell_models) was built after it. Build order is unrelated to
the order keys appear in the user's config, so a checker that reads a sibling
set earlier in the subclass body could not rely on it being assigned yet.

Splice each class's declared attributes into the inherited order instead:
overrides keep their inherited slot and a subclass's new attributes are threaded
in at the position the subclass declares them, so every class's declaration
order is preserved as a subsequence.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A subclass can now reorder inherited attributes by redeclaring them, as long as
it redeclares every attribute that sits between the ones it moves; the build
order then follows the subclass. When attributes it does not redeclare are
caught between reordered ones the result is ambiguous, so get_config_attributes
raises a ConfigurationError naming them. Folds the per-class splice into
get_config_attributes so the whole ordering rule lives in one function.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Helveg and others added 3 commits May 29, 2026 12:43
…ordering

Raise a dedicated AttributeOrderError (a ConfigurationError subclass) for an
ambiguous attribute-reordering instead of a bare ConfigurationError, document
the ordering algorithm as bullet points on get_config_attributes, cover the
rules in test_configuration, and add an "Attribute ordering" section with
examples to the configuration docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The fallback that recovers attributes registered on a class's merged set but
absent from its __dict__ also re-added attributes a subclass had unset (e.g.
Rhomboid.dimensions, which Layer drops), because the unset key was popped from
the working set but still present on an ancestor's _config_attrs. Track unset
keys and skip them in the fallback. Adds a regression test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… are built

load_simulation_modules now stops at the ancestor that owns the `modules`
attribute and raises if it is not built yet, instead of walking past it and
silently validating cell/connection models against a kernel missing the
simulation's modules. The build order guarantees `modules` precedes the models
today; this makes a future reorder fail loudly rather than silently. Adds tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Helveg
Contributor Author

Helveg commented May 29, 2026

@drodarie thanks, you were right about the ordering not being guaranteed. Fixed it at the source rather than working around it:

Attribute build order is now declaration-order-preserving (get_config_attributes, 92e6bc1 / 3623501). Walking the MRO base-first, each class's declaration order is kept as a subsequence: an overridden attribute keeps its inherited slot, and a subclass's new attributes are spliced in at the position the subclass declares them. So NestSimulation.modules (declared before cell_models / connection_models) is now built before them, which is what the model validators rely on.
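
To make the splicing rule concrete, here is a minimal sketch for a single subclass step. It is not `get_config_attributes`: the function name `splice_attributes` and the attribute lists are illustrative, it assumes a new attribute is placed right after the previously declared one (or right before the first override when nothing precedes it), and it simply rejects reordered overrides instead of implementing the redeclaration rule.

```python
def splice_attributes(inherited, declared):
    """Merge a subclass's declared attribute names into the inherited build order.

    Overrides keep their inherited slot; new attributes are threaded in next
    to their declared neighbours, so both orders survive as subsequences.
    """
    overrides = [name for name in declared if name in inherited]
    positions = [inherited.index(name) for name in overrides]
    if positions != sorted(positions):
        raise ValueError("this sketch does not handle reordered overrides")

    result = list(inherited)
    leading = []  # new attributes declared before the first override
    previous = None  # last declared attribute already placed in `result`
    for name in declared:
        if name in inherited:
            if previous is None and leading:
                # Put the leading new attributes just before the first override.
                index = result.index(name)
                result[index:index] = leading
                leading = []
            previous = name
        elif previous is None:
            leading.append(name)
        else:
            result.insert(result.index(previous) + 1, name)
            previous = name
    # No override at all: the new attributes simply follow the inherited ones.
    result.extend(leading)
    return result
```

With an inherited order of `["name", "cell_models", "connection_models", "devices"]` and a subclass body declaring `["modules", "cell_models", "connection_models"]`, `modules` is placed directly before `cell_models`.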

A subclass may also reorder inherited attributes by redeclaring them, but only if it redeclares every attribute caught between the ones it moves; otherwise the order is ambiguous and a new AttributeOrderError is raised naming the attributes to redeclare. Covered in test_configuration and documented in an "Attribute ordering" section.

Defensive guard (d7f64b0): load_simulation_modules now stops at the ancestor that owns modules and raises if it is not built yet, instead of walking past it and silently validating models against a kernel missing the modules. If the build order is ever changed, this fails loudly with a message explaining modules must precede cell_models / connection_models.
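
The guard can be pictured with a toy node tree. Everything below is hypothetical scaffolding (`Node`, `declared`, `built`, `find_simulation_modules`), and returning an empty list when no ancestor declares `modules` is an assumption made for the sketch; the point is only that the walk stops at the owner and fails loudly if `modules` has not been assigned yet.

```python
class ConfigurationError(Exception):
    """Stand-in for the BSB's ConfigurationError."""


class Node:
    declared = ()  # attribute names the class declares

    def __init__(self, parent=None):
        self.parent = parent
        self.built = {}  # attributes assigned so far during the build


class Simulation(Node):
    declared = ("modules", "cell_models", "connection_models")


class CellModel(Node):
    declared = ("model",)


def find_simulation_modules(node):
    """Walk up to the ancestor that declares `modules`; never walk past it."""
    current = node
    while current is not None:
        if "modules" in current.declared:
            if "modules" not in current.built:
                raise ConfigurationError(
                    "`modules` must be built before `cell_models` / `connection_models`"
                )
            return current.built["modules"]
        current = current.parent
    return []  # assumption: no owning ancestor means nothing to install
```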

Also fixed a regression the ordering change introduced: the fallback that recovers dynamically-registered attributes was resurrecting attributes a subclass had unset (Layer.dimensions); now tracked and skipped, with a regression test (6051edf).

Re-requesting review.

@Helveg Helveg requested a review from drodarie May 29, 2026 12:42
drodarie and others added 9 commits May 29, 2026 17:35
The build-time kernel proxy used a multiprocessing BaseManager, whose child
inherits (fork) or re-imports (spawn) the parent's NEST; installing a module
there crashes once NEST has been imported in the parent, which the bsb_nest
plugin always does. Launch the kernel by file path as an independent subprocess
that imports nest only on its own main thread and answers over a
multiprocessing.connection pipe, so a third party may import nest before the BSB
without affecting the kernel subprocess. A thin _KernelProxy keeps the
load_modules / has_delay / models interface; load_simulation_modules and the
model type handlers are unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a minimal NESTML neuron and a test that builds it into a NEST module
(skipping when pynestml, NEST, or the build toolchain is unavailable), then
checks it loads in the out-of-process kernel proxy, that a module-provided cell
model validates while an unknown one is rejected, and that loading it never
leaks into the main process's NEST. Add nestml to the test deps and bump CI NEST
to 3.8 to match.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`nestml~=8.3` excludes pre-releases, but the only published 8.3 is 8.3.0rc3, so
`uv sync` found no solution and failed every bsb-nest task (lint, docs, build).
Pin to `~=8.3.0rc2` (as cerebellar-models does) so the release candidate
resolves.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Both the model type handlers and the delay-required checker need to reach the
build's out-of-process kernel, install the enclosing simulation's modules, run
a query, and degrade gracefully when the kernel is unreachable. Factor that into
`query_kernel`; `NestModelTypeHandler` and `_is_delay_required` now just pass a
query and a fallback instead of each reimplementing the scaffolding.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NEST >= 3.8 requires Boost (>= 1.69), so its CMake configure failed without it,
leaving no nest_vars.sh and breaking the test step. Add libboost-dev. Also
include NEST_VERSION in the NEST cache key so bumping the version no longer
restores a stale install.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
NEST's CMake invokes Python helpers; when install-nest runs as a dependency of
the OTEL-instrumented `nx test`, the auto-instrumentation injected into those
helpers writes to stderr and breaks NEST's configure step ("Configuring
incomplete"). nx propagates the wrapper's environment to dependency tasks, so a
separate target alone doesn't insulate it. Build NEST in its own step before the
instrumented test step; the test target's install-nest dependency then finds
NEST installed and exits early.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
When the test suite runs under `mpiexec`, every rank inherits the launcher's
environment (OMPI_*/PMI_*/PMIX_*). The out-of-process kernel subprocess would
inherit it too, so its `import nest` called MPI_Init and blocked forever trying
to enroll as a rank of the parent job, hanging the build. Strip those variables
(and OTEL_*, plus set OTEL_SDK_DISABLED) so the kernel runs as an independent,
uninstrumented process.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…pped

NESTML compiles its module with CMake, which builds and runs small NEST-linked
helper programs to probe the kernel. Under `mpiexec` those helpers inherited
the launcher environment and blocked on MPI_Init, hanging the suite. Build the
module once on the main rank (the others wait on `bcast`) and run the build
with the MPI launcher variables stripped so the helpers run as plain processes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@Helveg
Contributor Author

Helveg commented May 29, 2026

@drodarie this round is mostly CI hardening on top of the proxy + attribute-ordering fixes; it should be ready for another look.

The out-of-process kernel now survives MPI. The bsb-nest suite runs under mpiexec -n 2, and the kernel subprocess was inheriting the launcher's OMPI_*/PMI_*/PMIX_* environment. Its import nest therefore called MPI_Init and blocked forever trying to enroll as a rank of the parent job, which is what was timing CI out at 60 minutes. The subprocess now starts from a stripped environment, so it runs as an independent, uninstrumented NEST process no matter how the parent was launched. (96401d6)
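
A stripped environment of this kind can be built with a few lines. The sketch below is an assumption about the shape of the fix, not the committed code: the helper name `kernel_environment` is made up, and `"true"` is the conventional value for `OTEL_SDK_DISABLED` rather than something stated in this thread.

```python
import os

# Prefixes of launcher / instrumentation variables the child must not inherit.
_STRIPPED_PREFIXES = ("OMPI_", "PMI_", "PMIX_", "OTEL_")


def kernel_environment(environ=None):
    """Return a copy of the environment without MPI launcher and OTEL variables."""
    source = os.environ if environ is None else environ
    env = {
        key: value
        for key, value in source.items()
        if not key.startswith(_STRIPPED_PREFIXES)
    }
    # Explicitly disable OpenTelemetry auto-instrumentation in the child.
    env["OTEL_SDK_DISABLED"] = "true"
    return env
```

The result would be passed as `env=` to `subprocess.Popen`, so the child's `import nest` no longer sees itself as a rank of the parent MPI job.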

NESTML extension-module coverage. Added a test that builds a minimal NESTML module and loads it through the proxy, asserting the model lands in the out-of-process kernel and never touches the main process. The build itself (CMake compiling and running NEST helper programs) hit the same MPI_Init wall under mpiexec, so it builds once on the main rank with the launcher environment stripped while the other ranks wait on bcast. (d4bfc4d)

NEST 3.8 and CI plumbing. Bumped NEST to 3.8 (added libboost, versioned the cache key) and moved the NEST build ahead of the opentelemetry-instrument-wrapped test step, so the auto-instrumentation no longer corrupts NEST's CMake Python helpers.

Verified locally against a from-source NEST 3.8 build, both single-process and under mpiexec -n 2: the full bsb-nest suite is green (35 passing serially; 14 run plus 21 parallel-skips under MPI, no hang). CI is green across the board.

Contributor

@drodarie drodarie left a comment

So now we have a build context which stores a proxy that interfaces with an isolated server that contains a NEST kernel.

I wonder if we did not get too deep on this one ;)

@Helveg
Contributor Author

Helveg commented May 30, 2026

Too deep? This is a perfect precedent for dealing with simulators that store global state 👍

@Helveg Helveg merged commit a49b943 into main May 30, 2026
29 checks passed
@Helveg Helveg deleted the worktree-issue-227-build-context branch May 30, 2026 18:10

Development

Successfully merging this pull request may close these issues.

Make NestSimulation configuration "NEST-kernel-free"

2 participants