mesh: isolate daemon config under ~/.myownllm/.myownmesh/ subdir #208

Merged
mrjeeves merged 2 commits into main from claude/charming-turing-PMIK8
May 28, 2026

@mrjeeves
Owner

Symptom

Saving a network in MyOwnLLM (Settings → Networks → Status → "+ Add network" → Save & activate) appeared to work within a session, but after restart every LLM setting was reset to defaults — providers, active family, every saved network, accepting policy, agent permissions, prompts, the auto-gossip toggle, all gone. The network the user just saved also vanished from the LLM's UI.

The user reported it as "is network saving broken?".

Root cause

The bundled myownmesh serve daemon and the LLM were both writing to ~/.myownllm/config.json with different, incompatible schemas:

  • LLM Config: {providers, active_family, cloud_mesh.networks, prompts under networks[*].prompts, agent_permissions under networks[*].agent_permissions, auto_update, ...}
  • daemon's myownmesh_core::MeshConfig: {version, identity_path, auto_update, auto_cleanup, daemon, networks} — note: top-level networks array, no cloud_mesh wrapper.

PR #203's daemon.rs set MYOWNMESH_HOME=~/.myownllm so the daemon shared the LLM's directory for identity + rosters. That sharing worked for those files, but it also resolved the daemon's {MYOWNMESH_HOME}/config.json to the same path as the LLM's ~/.myownllm/config.json, a file the two processes read and write with incompatible schemas.

MeshConfig isn't annotated with #[serde(deny_unknown_fields)], so MeshConfig::load() accepted the LLM-shaped file and silently ignored every LLM-only key as an unknown field. Then any NetworkAdd IPC call (the user clicking Save & activate) triggered the daemon's persist_network_add:

let mut cfg = MeshConfig::load().map_err(...)?;   // strips LLM keys on parse
cfg.networks.push(net.clone());
cfg.save().map_err(...)?;                          // writes only MeshConfig back

~/.myownllm/config.json would flip from LLM shape to daemon shape mid-session, wiping providers / active_family / cloud_mesh / prompts / permissions / auto_gossip / accepting. The LLM's in-memory _cached config kept the original LLM shape, so the user saw no damage until next launch. On restart, loadConfig read the daemon shape, mergeDefaults filled in LLM defaults from DEFAULT_CONFIG, and the user's settings appeared as factory defaults. cloud_mesh.networks ended up empty (the daemon's networks field is at top level, not inside cloud_mesh), so the user's saved network disappeared from the LLM's saved-networks list.
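
The lossy round trip can be reproduced in miniature. The sketch below is a model only, not the daemon's code: it assumes a hypothetical `MeshConfigModel` with two known fields and stands in for JSON with a small `Value` enum (the real `MeshConfig` is deserialized with serde). The `providers` and `active_family` values are made-up placeholders.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value (assumption: enough for this model).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Text(String),
    List(Vec<String>),
}

/// Stand-in for a parsed config.json: top-level key -> value.
type RawConfig = BTreeMap<String, Value>;

/// Hypothetical, trimmed-down model of the daemon's config type.
#[derive(Debug, Default)]
struct MeshConfigModel {
    version: Option<String>,
    networks: Vec<String>,
}

impl MeshConfigModel {
    /// Mimics a deserializer that tolerates unknown fields:
    /// keys the struct does not know are skipped without an error.
    fn load(file: &RawConfig) -> Self {
        let version = match file.get("version") {
            Some(Value::Text(v)) => Some(v.clone()),
            _ => None,
        };
        let networks = match file.get("networks") {
            Some(Value::List(n)) => n.clone(),
            _ => Vec::new(),
        };
        MeshConfigModel { version, networks }
    }

    /// Writes back only the fields the model knows about.
    fn save(&self) -> RawConfig {
        let mut out = RawConfig::new();
        if let Some(v) = &self.version {
            out.insert("version".to_string(), Value::Text(v.clone()));
        }
        out.insert("networks".to_string(), Value::List(self.networks.clone()));
        out
    }
}

/// An LLM-shape file with placeholder values.
fn llm_shape_file() -> RawConfig {
    let mut file = RawConfig::new();
    file.insert("providers".to_string(), Value::List(vec!["local".to_string()]));
    file.insert("active_family".to_string(), Value::Text("example-family".to_string()));
    file
}

/// Load, push, save: the same three steps as the snippet above.
fn add_network(file: &RawConfig, net: &str) -> RawConfig {
    let mut cfg = MeshConfigModel::load(file);
    cfg.networks.push(net.to_string());
    cfg.save()
}
```

Calling `add_network(&llm_shape_file(), "net-1")` returns a map whose only key is `networks`; `providers` and `active_family` are gone, which is the flip the LLM only noticed on its next launch.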

Fix

Isolate the daemon under a dedicated subdirectory: ~/.myownllm/.myownmesh/. The daemon's config.json + updates/ live there; the LLM's ~/.myownllm/config.json is no longer in the daemon's MYOWNMESH_HOME tree, so the daemon's persist path can't reach it. Identity, rosters, and signed governance states get moved into the subdir on first launch so existing users keep their pubkey + peer approvals — losing identity continuity would orphan every user's Device ID and force every paired peer through a fresh approval round.

New file: src-tauri/src/mesh/migration.rs

migrate_daemon_state_into_subdir(llm_dir, daemon_home):

  • Moves .secrets/identity.json — the ed25519 keypair. This is the critical one; everything else is recoverable but a new pubkey would split the user's mesh across every peer.
  • Moves mesh/rosters/*.json — per-network approved peers. Without these, every paired peer would re-prompt on next handshake.
  • Moves mesh/states/*.json — signed governance-state files for closed networks.

Idempotent (checks identity_dst.exists() at the top); destination collisions leave both files in place and log to stderr; cross-filesystem moves handled via copy + remove fallback. Four unit tests cover happy path, idempotence, fresh-install no-op, and destination collision.
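
The move semantics described above can be sketched with the standard library alone. This is an illustration under assumptions, not the contents of migration.rs: `move_file`, `MoveOutcome`, and `migrate_sketch` are hypothetical names, only the identity file and rosters are handled, and the fallback runs on any `rename` failure rather than on a cross-filesystem error specifically.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Result of one attempted move, so callers can tell the cases apart.
#[derive(Debug, PartialEq)]
enum MoveOutcome {
    Moved,
    SourceMissing,
    DestinationExists,
}

/// Move `src` to `dst` without ever overwriting an existing destination.
fn move_file(src: &Path, dst: &Path) -> io::Result<MoveOutcome> {
    if !src.exists() {
        return Ok(MoveOutcome::SourceMissing);
    }
    if dst.exists() {
        // Collision: keep both files and report it on stderr.
        eprintln!(
            "migration: {} already exists, leaving {} in place",
            dst.display(),
            src.display()
        );
        return Ok(MoveOutcome::DestinationExists);
    }
    if let Some(parent) = dst.parent() {
        fs::create_dir_all(parent)?;
    }
    // If rename fails (for example across filesystems), fall back to copy + remove.
    if fs::rename(src, dst).is_err() {
        fs::copy(src, dst)?;
        fs::remove_file(src)?;
    }
    Ok(MoveOutcome::Moved)
}

/// Identity + rosters only; a no-op once the identity is in the subdir.
fn migrate_sketch(llm_dir: &Path, daemon_home: &Path) -> io::Result<()> {
    let identity_dst = daemon_home.join(".secrets").join("identity.json");
    if identity_dst.exists() {
        return Ok(()); // already migrated
    }
    move_file(&llm_dir.join(".secrets").join("identity.json"), &identity_dst)?;

    let rosters_src = llm_dir.join("mesh").join("rosters");
    if rosters_src.is_dir() {
        // Collect entries first so the directory is not modified mid-iteration.
        let entries = fs::read_dir(&rosters_src)?.collect::<io::Result<Vec<_>>>()?;
        for entry in entries {
            let path = entry.path();
            let is_json = path.extension().map_or(false, |e| e == "json");
            if let (true, Some(name)) = (is_json, path.file_name()) {
                move_file(&path, &daemon_home.join("mesh").join("rosters").join(name))?;
            }
        }
    }
    Ok(())
}
```

On a fresh install the identity source is missing, so `move_file` returns `SourceMissing` and nothing is created. A second run after a successful migration returns at the `identity_dst.exists()` check.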

src-tauri/src/main.rs

The setup hook runs the migration before setting MYOWNMESH_HOME, then points the env var at the subdir. Comments spell out the bug we're fixing so the next person who touches this path doesn't accidentally regress it.

src-tauri/src/mesh/daemon.rs

  • socket_for_mode(OwnLlm) now returns ~/.myownllm/.myownmesh/daemon.sock (matching the daemon's data_dir()/daemon.sock calculation under the new MYOWNMESH_HOME).
  • Spawn path's home calculation uses the subdir.
  • Shared mode untouched — that's the MyOwnMesh GUI's own ~/.myownmesh/daemon.sock and lives outside the LLM's tree entirely.
  • Module-level + enum-variant doc updated to match.
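
A minimal sketch of the path layout these bullets describe. The helper names (`daemon_home`, `own_llm_socket`, `daemon_command`) are hypothetical, and passing MYOWNMESH_HOME through `Command::env` is an assumption about how the spawn path hands the directory to the child; the PR only states that the env var points at the subdir.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;

/// The daemon's isolated home: a subdirectory of the LLM's config dir.
fn daemon_home(llm_dir: &Path) -> PathBuf {
    llm_dir.join(".myownmesh")
}

/// Socket path for the bundled (OwnLlm) daemon under the new home.
fn own_llm_socket(llm_dir: &Path) -> PathBuf {
    daemon_home(llm_dir).join("daemon.sock")
}

/// Builds (but does not spawn) a `myownmesh serve` command whose
/// MYOWNMESH_HOME points at the subdir, not at the LLM's directory.
fn daemon_command(llm_dir: &Path) -> Command {
    let mut cmd = Command::new("myownmesh");
    cmd.arg("serve").env("MYOWNMESH_HOME", daemon_home(llm_dir));
    cmd
}
```

Because every daemon path is derived from `daemon_home`, the daemon's config.json resolves inside `.myownmesh/` and can no longer coincide with the LLM's `~/.myownllm/config.json`.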

src/config.ts — recovery for users who already hit the bug

Their config.json is currently the daemon's shape. salvageDaemonShapeLeakage runs at the top of mergeDefaults:

  1. Detect: top-level networks array whose entries all carry id + network_id, AND cloud_mesh.networks empty/absent.
  2. Convert each daemon NetworkConfig to LLM NetworkConfig shape (signaling.servers → signaling_servers; stun_servers[].urls flattened; turn_servers' first url + auth lifted). LLM-only fields (accepting, agent_permissions, prompts, auto_gossip) default via mergeNetwork.
  3. Strip daemon-only top-level keys (version, identity_path, daemon, networks) so subsequent saves are clean LLM shape.

auto_update and auto_cleanup are shared between both schemas with compatible field names — left alone; the LLM's mergeDefaults handles them with its own per-field merge. No-op when nothing matches the detection signature — fresh installs + uncorrupted configs are untouched.
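
The detection signature (step 1) and the key strip (step 3) can be modeled compactly. The real `salvageDaemonShapeLeakage` lives in src/config.ts and is TypeScript; the Rust below is only a model of the rule, with hypothetical types (`NetworkEntry`, `ConfigView`) and the conversion in step 2 left out. Treating an empty top-level `networks` array as a match is an assumption, since the description does not say how that case is handled.

```rust
use std::collections::BTreeMap;

/// Only the two fields the detection rule inspects.
#[derive(Debug, Clone)]
struct NetworkEntry {
    id: Option<String>,
    network_id: Option<String>,
}

/// The two places a saved-network list can live in config.json.
struct ConfigView {
    top_level_networks: Option<Vec<NetworkEntry>>,
    cloud_mesh_networks: Option<Vec<NetworkEntry>>,
}

/// Step 1: a top-level `networks` array whose entries all carry
/// `id` + `network_id`, and an empty or absent `cloud_mesh.networks`.
fn looks_like_daemon_shape(cfg: &ConfigView) -> bool {
    let daemon_networks = match &cfg.top_level_networks {
        // Assumption: an empty array also matches (`all` is vacuously true).
        Some(list) => list
            .iter()
            .all(|n| n.id.is_some() && n.network_id.is_some()),
        None => false,
    };
    let llm_networks_empty = cfg
        .cloud_mesh_networks
        .as_ref()
        .map_or(true, |list| list.is_empty());
    daemon_networks && llm_networks_empty
}

/// Step 3: keys that exist only in the daemon's schema.
const DAEMON_ONLY_KEYS: [&str; 4] = ["version", "identity_path", "daemon", "networks"];

/// Removes daemon-only keys; shared keys such as `auto_update` are kept.
fn strip_daemon_keys(top_level: &mut BTreeMap<String, String>) {
    for key in DAEMON_ONLY_KEYS {
        top_level.remove(key);
    }
}
```

A config that already has entries under `cloud_mesh.networks` never matches, so an uncorrupted file passes through unchanged.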

src-tauri/Cargo.toml

Added tempfile = "3" under [dev-dependencies] for the migration tests.

Why not patch myownmesh-core to preserve unknown fields?

That would also work, but it requires a MyOwnMesh release + .myownmesh-rev bump on the LLM side, and the fix would reach users slowly, since anyone running a shared daemon that pre-dates the patch would not have it. The subdir isolation is a self-contained fix on the LLM side and prevents the same class of bug from recurring if the daemon's schema grows new top-level fields in the future.

What's unrecoverable for affected users

The salvage in config.ts restores what the daemon kept — the network IDs and transport config. But the daemon's NetworkConfig doesn't carry the LLM's accepting, agent_permissions, prompts, or auto_gossip fields, so those reset to fresh defaults. Providers and active_family also reset because the daemon's write destroyed them. The user's identity (identity.json) and rosters survive untouched — those weren't in config.json.

Validation

  • pnpm run check: 164 files, 0 errors, 0 warnings.
  • pnpm run build: clean.
  • cargo test --bins -p myownllm, covering the new migration module's four unit tests. (Sandbox lacks gdk-3.0 so the full crate doesn't build here; please verify locally.)
  • Fresh install of the new build, no existing data: app launches, daemon spawns, ~/.myownllm/.myownmesh/.secrets/identity.json appears, ~/.myownllm/config.json stays clean LLM shape.
  • Upgrade from a pre-fix build with existing identity + rosters: identity file relocates into the subdir; pubkey + peer approvals carry over; old ~/.myownllm/.secrets/identity.json is gone.
  • Upgrade from a pre-fix build that already hit the bug (daemon-shape config.json): saved networks recovered into cloud_mesh.networks, daemon-only top-level keys stripped, providers/active_family reset to defaults (irrecoverable).
  • Save & activate a network on the new build: ~/.myownllm/config.json retains full LLM shape across the operation; daemon's writes land in ~/.myownllm/.myownmesh/config.json only.
  • Switch active networks between two saved networks: both files update independently; the LLM's cloud_mesh.active_network_id and the daemon's networks list stay in sync via reconcile.

https://claude.ai/code/session_01RLu1LdTgtxEDdzhybzqFrk


Generated by Claude Code

claude added 2 commits May 28, 2026 07:51
## Symptom

Saving a network in MyOwnLLM (Settings → Networks → Status →
"+ Add network" → Save & activate) "worked" within a session
but every LLM setting was reset to defaults on the next launch —
providers, active family, all saved networks, accepting policy,
agent permissions, prompts, auto-gossip toggles, everything.
The network the user just saved was also gone from the LLM's UI.

## Root cause

The bundled `myownmesh serve` daemon and the LLM both wrote to
`~/.myownllm/config.json` with different schemas:

- LLM Config: `{providers, active_family, cloud_mesh.networks, ...}`
- daemon's `myownmesh_core::MeshConfig`: `{version, identity_path,
  auto_update, auto_cleanup, daemon, networks}` (note: top-level
  `networks`, no `cloud_mesh` wrapper)

PR #203's `daemon.rs` set `MYOWNMESH_HOME=~/.myownllm` so the
daemon shared the LLM's directory for identity + rosters. That
sharing worked for those files but pointed the daemon's
`config.json` at the same path as the LLM's, where they have
incompatible schemas.

`MeshConfig` isn't annotated with `#[serde(deny_unknown_fields)]`,
so `MeshConfig::load()` silently ignored the LLM's keys as unknown
fields. Then any `NetworkAdd` IPC call triggered:

```
let mut cfg = MeshConfig::load().map_err(...)?;  // strips every LLM-only key on parse
cfg.networks.push(net.clone());
cfg.save().map_err(...)?;                        // writes only MeshConfig's fields back
```

Result: `~/.myownllm/config.json` flips from LLM shape to daemon
shape mid-session, wiping providers / active_family / cloud_mesh /
prompts / permissions / auto_gossip / accepting / etc. The
in-memory `_cached` config keeps the LLM shape so the user sees
no damage until restart; on next load, `loadConfig` reads the
daemon shape, `mergeDefaults` fills in LLM defaults from
`DEFAULT_CONFIG`, and the user's settings appear as factory
defaults.

`cloud_mesh.networks` ends up empty (the daemon's top-level
`networks` field isn't where the LLM looks), so the network the
user just saved disappears from the saved-networks list.

## Fix

Isolate the daemon's config + updates under a dedicated
subdirectory: `~/.myownllm/.myownmesh/`. The LLM's
`~/.myownllm/config.json` is no longer in the daemon's
`MYOWNMESH_HOME` tree, so the daemon's persist path can't reach
it. Identity, rosters, and signed governance states get moved
into the subdir on first launch — losing identity continuity
would orphan every user's Device ID and force every paired peer
through a fresh approval round, which would be worse than the
bug.

### New file: `src-tauri/src/mesh/migration.rs`

`migrate_daemon_state_into_subdir(llm_dir, daemon_home)`:

- Moves `.secrets/identity.json` — the ed25519 keypair. This
  is the critical one; everything else is recoverable but a new
  pubkey would split the user's mesh across every peer.
- Moves `mesh/rosters/*.json` — per-network approved peers.
  Without these, every paired peer would re-prompt for approval
  on next handshake.
- Moves `mesh/states/*.json` — signed governance-state files
  for closed networks. Required for closed-network identity
  continuity (the founder's signed log can't be regenerated).

Idempotence: checks `identity_dst.exists()` at the top and
no-ops on subsequent runs. Destination collisions (e.g. an
interrupted earlier run) leave both files in place and log to
stderr — the migration doesn't choose between them.

Cross-filesystem moves handled via copy + remove (rare under a
single home dir but cheap to handle correctly).

Four unit tests cover: happy path, idempotence, fresh install
(no source files), destination collision.

### `src-tauri/src/main.rs`

The setup hook runs the new migration before setting
`MYOWNMESH_HOME`, then points the env var at the subdir. Comments
spell out the bug we're fixing so the next person who touches
this path doesn't accidentally regress it.

### `src-tauri/src/mesh/daemon.rs`

- `socket_for_mode(OwnLlm)` now returns
  `~/.myownllm/.myownmesh/daemon.sock` (matching the daemon's
  `data_dir()/daemon.sock` calculation under the new
  `MYOWNMESH_HOME`).
- Spawn path's `home` calculation uses
  `~/.myownllm/.myownmesh/`.
- `Shared` mode untouched — that's the MyOwnMesh GUI's own
  `~/.myownmesh/daemon.sock` and lives outside the LLM's tree.
- Module-level + enum-variant doc updated to match.

### `src/config.ts`

Recovery for users who already hit the bug — their `config.json`
is currently the daemon's shape. `salvageDaemonShapeLeakage`
runs at the top of `mergeDefaults`:

1. Detects daemon shape via the presence of a top-level
   `networks` array whose entries all have `id` + `network_id`,
   AND the absence of populated `cloud_mesh.networks`.
2. Converts each daemon `NetworkConfig` to LLM `NetworkConfig`
   shape (signaling.servers → signaling_servers; stun_servers
   urls flattened; turn_servers' first url + auth lifted). LLM-only
   fields (`accepting`, `agent_permissions`, `prompts`,
   `auto_gossip`) default via `mergeNetwork`.
3. Strips daemon-only top-level keys (`version`,
   `identity_path`, `daemon`, `networks`) so subsequent saves
   are clean LLM shape.

`auto_update` and `auto_cleanup` are shared between both
schemas with compatible field names, so they are left alone; the
LLM's `mergeDefaults` handles them with its own per-field merge.

No-op when nothing matches the detection signature. Fresh
installs + uncorrupted configs are untouched.

### `src-tauri/Cargo.toml`

Added `tempfile = "3"` under `[dev-dependencies]` for the
migration tests' `tempdir()` fixture.

## Why not patch myownmesh-core to preserve unknown fields?

That would also work, but it requires a MyOwnMesh release +
`.myownmesh-rev` bump, and the fix would reach users slowly, since
anyone running a shared daemon that pre-dates the patch would not
have it. The subdir isolation is a self-contained fix on the LLM
side and prevents the same class of bug from recurring if the
daemon's schema grows fields in the future.

## Validation

- [x] `pnpm run check`: 164 files, 0 errors, 0 warnings.
- [x] `pnpm run build`: clean.
- [ ] `cargo test --bins -p myownllm`, covering the new migration
  module's four unit tests. (Sandbox lacks gdk-3.0 so the
  full crate doesn't build here; please verify locally.)
- [ ] Fresh install of the new build, no existing data: app
  launches, no `~/.myownllm/.myownmesh/` until first daemon
  spawn, then it appears with identity + (after first network
  save) `networks/` + `config.json` inside. Parent
  `~/.myownllm/config.json` untouched by the daemon.
- [ ] Upgrade from a pre-fix build with existing identity +
  rosters: `.myownmesh/.secrets/identity.json` and
  `.myownmesh/mesh/rosters/*.json` populated after first
  launch; old locations gone. Same pubkey + same peer
  approvals as before.
- [ ] Upgrade from a pre-fix build that already hit the bug
  (daemon-shape `config.json`): saved networks recovered into
  `cloud_mesh.networks`, daemon-only top-level keys stripped.
  Providers / active_family reset to defaults (irrecoverable —
  the daemon's write destroyed them; the salvage only restores
  what the daemon kept).
- [ ] Save & activate a network: `~/.myownllm/config.json`
  retains full LLM shape; daemon's writes land at
  `~/.myownllm/.myownmesh/config.json` only.
- [ ] Switch active networks: both files update independently;
  LLM's `cloud_mesh.active_network_id` and daemon's
  `networks` list stay in sync via reconcile.

https://claude.ai/code/session_01RLu1LdTgtxEDdzhybzqFrk

CI on macos-14 / ubuntu-latest / windows-latest failed
`cargo fmt --check` against `src-tauri/src/mesh/migration.rs:270`:
one test setup call was over the line-width limit and rustfmt
wanted it across three lines. No semantic change.
@mrjeeves mrjeeves merged commit 1237f8b into main May 28, 2026
4 checks passed
@mrjeeves mrjeeves deleted the claude/charming-turing-PMIK8 branch May 28, 2026 08:05