mesh: isolate daemon config under ~/.myownllm/.myownmesh/ subdir #208
Merged
## Symptom
Saving a network in MyOwnLLM (Settings → Networks → Status →
"+ Add network" → Save & activate) "worked" within a session
but every LLM setting was reset to defaults on the next launch —
providers, active family, all saved networks, accepting policy,
agent permissions, prompts, auto-gossip toggles, everything.
The network the user just saved was also gone from the LLM's UI.
## Root cause
The bundled `myownmesh serve` daemon and the LLM both wrote to
`~/.myownllm/config.json` with different schemas:
- LLM Config: `{providers, active_family, cloud_mesh.networks, ...}`
- daemon's `myownmesh_core::MeshConfig`: `{version, identity_path,
auto_update, auto_cleanup, daemon, networks}` (note: top-level
`networks`, no `cloud_mesh` wrapper)
PR #203's `daemon.rs` set `MYOWNMESH_HOME=~/.myownllm` so the
daemon shared the LLM's directory for identity + rosters. That
sharing worked for those files but pointed the daemon's
`config.json` at the same path as the LLM's, where they have
incompatible schemas.
`MeshConfig::load()` doesn't use `#[serde(deny_unknown_fields)]`,
so the LLM's keys deserialized silently as unknown-and-ignored.
Then any `NetworkAdd` IPC call triggered:
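```
// Sketch of the daemon's persist path; error handling elided.
let mut cfg = MeshConfig::load(); // strips every LLM-only key on parse
cfg.networks.push(new);           // appends the network from the NetworkAdd call
cfg.save();                       // writes only MeshConfig's fields back
```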
Result: `~/.myownllm/config.json` flips from LLM shape to daemon
shape mid-session, wiping providers / active_family / cloud_mesh /
prompts / permissions / auto_gossip / accepting / etc. The
in-memory `_cached` config keeps the LLM shape so the user sees
no damage until restart; on next load, `loadConfig` reads the
daemon shape, `mergeDefaults` fills in LLM defaults from
`DEFAULT_CONFIG`, and the user's settings appear as factory
defaults.
`cloud_mesh.networks` ends up empty (the daemon's top-level
`networks` field isn't where the LLM looks), so the network the
user just saved disappears from the saved-networks list.
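The load, push, save sequence described above can be reproduced in miniature. The following is a minimal TypeScript sketch, not the daemon's actual Rust code: `loadAsMeshConfig` and `daemonAddNetwork` are hypothetical names, and the tolerant parse is simulated by copying only the keys listed for `MeshConfig` above.

```typescript
// Hypothetical, simplified stand-in for a JSON object on disk.
type Json = Record<string, unknown>;

// Top-level keys of the daemon-side schema, as listed in the description above.
const MESH_KEYS = ["version", "identity_path", "auto_update", "auto_cleanup", "daemon", "networks"];

// Mimics a tolerant parse: keys outside the schema are dropped without an error.
function loadAsMeshConfig(raw: string): Json {
  const parsed = JSON.parse(raw) as Json;
  const out: Json = {};
  for (const key of MESH_KEYS) {
    if (key in parsed) out[key] = parsed[key];
  }
  if (!Array.isArray(out["networks"])) out["networks"] = [];
  return out;
}

// Mimics load -> push -> save against the contents of a shared file.
function daemonAddNetwork(fileContents: string, network: Json): string {
  const cfg = loadAsMeshConfig(fileContents);
  (cfg["networks"] as Json[]).push(network);
  return JSON.stringify(cfg);
}

// An LLM-shaped file (values are made up for illustration).
const llmFile = JSON.stringify({
  providers: [{ name: "example-provider" }],
  active_family: "example-family",
  cloud_mesh: { networks: [] },
});

const afterSave = JSON.parse(
  daemonAddNetwork(llmFile, { id: "n1", network_id: "abc" }),
) as Json;

// Only daemon-side keys survive the round trip.
console.log(Object.keys(afterSave));
```

Because nothing in the parse step rejects or preserves unknown keys, the write silently replaces the whole file with the narrower schema.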
## Fix
Isolate the daemon's config + updates under a dedicated
subdirectory: `~/.myownllm/.myownmesh/`. The LLM's
`~/.myownllm/config.json` is no longer in the daemon's
`MYOWNMESH_HOME` tree, so the daemon's persist path can't reach
it. Identity, rosters, and signed governance states get moved
into the subdir on first launch — losing identity continuity
would orphan every user's Device ID and force every paired peer
through a fresh approval round, which would be worse than the
bug.
### New file: `src-tauri/src/mesh/migration.rs`
`migrate_daemon_state_into_subdir(llm_dir, daemon_home)`:
- Moves `.secrets/identity.json` — the ed25519 keypair. This
is the critical one; everything else is recoverable but a new
pubkey would split the user's mesh across every peer.
- Moves `mesh/rosters/*.json` — per-network approved peers.
Without these, every paired peer would re-prompt for approval
on next handshake.
- Moves `mesh/states/*.json` — signed governance-state files
for closed networks. Required for closed-network identity
continuity (the founder's signed log can't be regenerated).
Idempotence: checks `identity_dst.exists()` at the top and
no-ops on subsequent runs. Destination collisions (e.g. an
interrupted earlier run) leave both files in place and log to
stderr — the migration doesn't choose between them.
Cross-filesystem moves are handled via copy + remove (rare under a
single home dir, but cheap to handle correctly).
Four unit tests cover: happy path, idempotence, fresh install
(no source files), destination collision.
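The idempotence and collision rules above can be sketched as follows. This is an illustrative TypeScript model over an in-memory path map, not the Rust implementation in `migration.rs`; `FakeFs`, `MigrationReport`, and `migrateIntoSubdir` are hypothetical names, and a "move" is modelled as copy followed by remove.

```typescript
// A tiny in-memory "filesystem": absolute path -> file contents. Illustrative only.
type FakeFs = Map<string, string>;

interface MigrationReport {
  moved: string[];      // destination paths that were populated
  collisions: string[]; // destinations that already existed; both copies are kept
  skipped: boolean;     // true when the identity already lives in the subdir
}

const IDENTITY = ".secrets/identity.json";
const MOVED_PREFIXES = ["mesh/rosters/", "mesh/states/"];

function migrateIntoSubdir(files: FakeFs, llmDir: string, daemonHome: string): MigrationReport {
  const report: MigrationReport = { moved: [], collisions: [], skipped: false };

  // Idempotence check: once the identity is in the subdir, later runs do nothing.
  if (files.has(`${daemonHome}/${IDENTITY}`)) {
    report.skipped = true;
    return report;
  }

  // Snapshot the keys first so the map can be mutated while we walk the list.
  const candidates = Array.from(files.keys()).filter((p) => {
    if (!p.startsWith(`${llmDir}/`) || p.startsWith(`${daemonHome}/`)) return false;
    const rel = p.slice(llmDir.length + 1);
    if (rel === IDENTITY) return true;
    return MOVED_PREFIXES.some((pre) => rel.startsWith(pre) && rel.endsWith(".json"));
  });

  for (const src of candidates) {
    const dst = `${daemonHome}/${src.slice(llmDir.length + 1)}`;
    if (files.has(dst)) {
      // Collision: leave both files in place and report, rather than pick a winner.
      report.collisions.push(dst);
      continue;
    }
    files.set(dst, files.get(src) as string); // copy
    files.delete(src);                        // remove
    report.moved.push(dst);
  }
  return report;
}

const files: FakeFs = new Map([
  ["/home/u/.myownllm/.secrets/identity.json", "KEY"],
  ["/home/u/.myownllm/mesh/rosters/net1.json", "{}"],
  ["/home/u/.myownllm/config.json", "{}"],
]);
const firstRun = migrateIntoSubdir(files, "/home/u/.myownllm", "/home/u/.myownllm/.myownmesh");
const secondRun = migrateIntoSubdir(files, "/home/u/.myownllm", "/home/u/.myownllm/.myownmesh");
console.log(firstRun.moved.length, secondRun.skipped);
```

In this model the LLM's `config.json` is never a candidate, which mirrors the intent that only identity, rosters, and governance states change location.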
### `src-tauri/src/main.rs`
The setup hook runs the new migration before setting
`MYOWNMESH_HOME`, then points the env var at the subdir. Comments
spell out the bug we're fixing so the next person who touches
this path doesn't accidentally regress it.
### `src-tauri/src/mesh/daemon.rs`
- `socket_for_mode(OwnLlm)` now returns
`~/.myownllm/.myownmesh/daemon.sock` (matching the daemon's
`data_dir()/daemon.sock` calculation under the new
`MYOWNMESH_HOME`).
- Spawn path's `home` calculation uses
`~/.myownllm/.myownmesh/`.
- `Shared` mode untouched — that's the MyOwnMesh GUI's own
`~/.myownmesh/daemon.sock` and lives outside the LLM's tree.
- Module-level + enum-variant doc updated to match.
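As a quick illustration of the resulting layout, here is a TypeScript sketch of the path arithmetic. The real logic lives in Rust in `daemon.rs`; `daemonPaths` and `llmConfigPath` are hypothetical helpers, the `Shared` config path is an assumption by analogy, and POSIX-style separators are assumed.

```typescript
type DaemonMode = "OwnLlm" | "Shared";

interface DaemonPaths {
  home: string;   // value MYOWNMESH_HOME would point at
  config: string; // daemon's own config.json (assumed to sit directly under home)
  socket: string; // IPC socket path
}

function daemonPaths(mode: DaemonMode, userHome: string): DaemonPaths {
  const home =
    mode === "OwnLlm" ? `${userHome}/.myownllm/.myownmesh` : `${userHome}/.myownmesh`;
  return { home, config: `${home}/config.json`, socket: `${home}/daemon.sock` };
}

// The LLM's own config path, which the daemon must never write to.
const llmConfigPath = (userHome: string): string => `${userHome}/.myownllm/config.json`;

const own = daemonPaths("OwnLlm", "/home/u");
const shared = daemonPaths("Shared", "/home/u");

console.log(own.socket);    // /home/u/.myownllm/.myownmesh/daemon.sock
console.log(shared.socket); // /home/u/.myownmesh/daemon.sock
console.log(own.config !== llmConfigPath("/home/u")); // true
```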
### `src/config.ts`
Recovery for users who already hit the bug — their `config.json`
is currently the daemon's shape. `salvageDaemonShapeLeakage`
runs at the top of `mergeDefaults`:
1. Detects daemon shape via the presence of a top-level
`networks` array whose entries all have `id` + `network_id`,
AND the absence of populated `cloud_mesh.networks`.
2. Converts each daemon `NetworkConfig` to LLM `NetworkConfig`
shape (signaling.servers → signaling_servers; stun_servers
urls flattened; turn_servers' first url + auth lifted). LLM-only
fields (`accepting`, `agent_permissions`, `prompts`,
`auto_gossip`) default via `mergeNetwork`.
3. Strips daemon-only top-level keys (`version`,
`identity_path`, `daemon`, `networks`) so subsequent saves
are clean LLM shape.
`auto_update` and `auto_cleanup` are shared between both
schemas with compatible field names, so they are left alone; the LLM's
`mergeDefaults` handles them with its own per-field merge.
No-op when nothing matches the detection signature. Fresh
installs + uncorrupted configs are untouched.
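The three salvage steps above can be sketched as follows. This is a hedged approximation rather than the code in `src/config.ts`: the function names, the daemon-side field layout (`signaling.servers`, `stun_servers[].urls`, `turn_servers[].urls` with `username` and `credential`), and the LLM-side `turn_server` field are assumptions for illustration, and the `mergeNetwork` defaulting step is omitted.

```typescript
type Json = Record<string, unknown>;

const DAEMON_ONLY_KEYS = ["version", "identity_path", "daemon", "networks"];

function isObject(v: unknown): v is Json {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Step 1: detection signature.
function looksLikeDaemonShape(raw: Json): boolean {
  const nets = raw["networks"];
  if (!Array.isArray(nets)) return false;
  const allTagged = nets.every((n) => isObject(n) && "id" in n && "network_id" in n);
  const mesh = raw["cloud_mesh"];
  const llmNets = isObject(mesh) ? mesh["networks"] : undefined;
  const llmPopulated = Array.isArray(llmNets) && llmNets.length > 0;
  return allTagged && !llmPopulated;
}

// Step 2: convert one daemon network entry to the (assumed) LLM layout.
function convertNetwork(n: Json): Json {
  const sig = n["signaling"];
  const servers = isObject(sig) ? sig["servers"] : undefined;

  const stunUrls: unknown[] = [];
  const stun = n["stun_servers"];
  if (Array.isArray(stun)) {
    for (const entry of stun) {
      const urls = isObject(entry) ? entry["urls"] : undefined;
      if (Array.isArray(urls)) stunUrls.push(...urls); // flatten urls
    }
  }

  const out: Json = {
    id: n["id"],
    network_id: n["network_id"],
    signaling_servers: Array.isArray(servers) ? servers : [],
    stun_servers: stunUrls,
  };

  const turn = n["turn_servers"];
  const firstTurn = Array.isArray(turn) && isObject(turn[0]) ? (turn[0] as Json) : undefined;
  if (firstTurn) {
    const urls = firstTurn["urls"];
    out["turn_server"] = {
      url: Array.isArray(urls) ? urls[0] : undefined, // first url lifted
      username: firstTurn["username"],
      credential: firstTurn["credential"],
    };
  }
  return out;
}

// Step 3: strip daemon-only keys and re-home the networks under cloud_mesh.
function salvage(raw: Json): Json {
  if (!looksLikeDaemonShape(raw)) return raw; // no-op for LLM-shaped configs
  const out: Json = {};
  for (const key of Object.keys(raw)) {
    if (DAEMON_ONLY_KEYS.indexOf(key) === -1) out[key] = raw[key];
  }
  const mesh = raw["cloud_mesh"];
  out["cloud_mesh"] = {
    ...(isObject(mesh) ? mesh : {}),
    networks: (raw["networks"] as Json[]).map(convertNetwork),
  };
  return out;
}

const damaged: Json = {
  version: 1,
  identity_path: "x",
  auto_update: { enabled: true },
  daemon: {},
  networks: [{
    id: "n1",
    network_id: "abc",
    signaling: { servers: ["wss://s.example"] },
    stun_servers: [{ urls: ["stun:a.example", "stun:b.example"] }],
    turn_servers: [],
  }],
};
const clean: Json = { providers: [], cloud_mesh: { networks: [{ id: "a", network_id: "b" }] } };

const fixed = salvage(damaged);
const untouched = salvage(clean);
console.log(Object.keys(fixed)); // daemon-only keys are gone; auto_update and cloud_mesh remain
```

Returning the input unchanged when the signature does not match keeps the salvage safe to run on every load.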
### `src-tauri/Cargo.toml`
Added `tempfile = "3"` under `[dev-dependencies]` for the
migration tests' `tempdir()` fixture.
## Why not patch myownmesh-core to preserve unknown fields?
That would also work, but it requires a MyOwnMesh release +
`.myownmesh-rev` bump, and the fix would only slowly reach anyone
running a shared daemon that pre-dates the patch. The subdir isolation is
a self-contained fix on the LLM side and prevents the same class
of bug from recurring if the daemon's schema grows fields in the
future.
## Validation
- [x] `pnpm run check`: 164 files, 0 errors, 0 warnings.
- [x] `pnpm run build`: clean.
- [ ] `cargo test --bins -p myownllm` for the new migration
module's four unit tests. (Sandbox lacks gdk-3.0, so the
full crate doesn't build here; please verify locally.)
- [ ] Fresh install of the new build, no existing data: app
launches, no `~/.myownllm/.myownmesh/` until first daemon
spawn, then it appears with identity + (after first network
save) `networks/` + `config.json` inside. Parent
`~/.myownllm/config.json` untouched by the daemon.
- [ ] Upgrade from a pre-fix build with existing identity +
rosters: `.myownmesh/.secrets/identity.json` and
`.myownmesh/mesh/rosters/*.json` populated after first
launch; old locations gone. Same pubkey + same peer
approvals as before.
- [ ] Upgrade from a pre-fix build that already hit the bug
(daemon-shape `config.json`): saved networks recovered into
`cloud_mesh.networks`, daemon-only top-level keys stripped.
Providers / active_family reset to defaults (irrecoverable —
the daemon's write destroyed them; the salvage only restores
what the daemon kept).
- [ ] Save & activate a network: `~/.myownllm/config.json`
retains full LLM shape; daemon's writes land at
`~/.myownllm/.myownmesh/config.json` only.
- [ ] Switch active networks: both files update independently;
LLM's `cloud_mesh.active_network_id` and daemon's
`networks` list stay in sync via reconcile.
https://claude.ai/code/session_01RLu1LdTgtxEDdzhybzqFrk
CI on macos-14 / ubuntu-latest / windows-latest failed `cargo fmt --check` against `src-tauri/src/mesh/migration.rs:270` — one test setup call was over the line-width limit and rustfmt wanted it across three lines. No semantic change.
## Symptom
Saving a network in MyOwnLLM (Settings → Networks → Status → "+ Add network" → Save & activate) appeared to work within a session, but after restart every LLM setting was reset to defaults: providers, active family, every saved network, accepting policy, agent permissions, prompts, the auto-gossip toggle, all gone. The network the user just saved also vanished from the LLM's UI.
The user reported it as "is network saving broken?".
## Root cause
The bundled `myownmesh serve` daemon and the LLM were both writing to `~/.myownllm/config.json` with different, incompatible schemas:
- LLM config: `{providers, active_family, cloud_mesh.networks, prompts under networks[*].prompts, agent_permissions under networks[*].agent_permissions, auto_update, ...}`
- `myownmesh_core::MeshConfig`: `{version, identity_path, auto_update, auto_cleanup, daemon, networks}` (note: top-level `networks` array, no `cloud_mesh` wrapper).
PR #203's `daemon.rs` set `MYOWNMESH_HOME=~/.myownllm` so the daemon shared the LLM's directory for identity + rosters. That sharing worked for those files but had the side effect of pointing the daemon's `{MYOWNMESH_HOME}/config.json` at the same path as the LLM's `~/.myownllm/config.json`, where they have incompatible schemas.
`MeshConfig::load()` doesn't use `#[serde(deny_unknown_fields)]`, so every LLM-only key deserialized as unknown-and-silently-ignored. Then any `NetworkAdd` IPC call (the user clicking Save & activate) triggered the daemon's `persist_network_add`.
`~/.myownllm/config.json` would flip from LLM shape to daemon shape mid-session, wiping providers / active_family / cloud_mesh / prompts / permissions / auto_gossip / accepting. The LLM's in-memory `_cached` config kept the original LLM shape, so the user saw no damage until next launch. On restart, `loadConfig` read the daemon shape, `mergeDefaults` filled in LLM defaults from `DEFAULT_CONFIG`, and the user's settings appeared as factory defaults.
`cloud_mesh.networks` ended up empty (the daemon's `networks` field is at top level, not inside `cloud_mesh`), so the user's saved network disappeared from the LLM's saved-networks list.
## Fix
Isolate the daemon under a dedicated subdirectory: `~/.myownllm/.myownmesh/`. The daemon's `config.json` + `updates/` live there; the LLM's `~/.myownllm/config.json` is no longer in the daemon's `MYOWNMESH_HOME` tree, so the daemon's persist path can't reach it. Identity, rosters, and signed governance states get moved into the subdir on first launch so existing users keep their pubkey + peer approvals; losing identity continuity would orphan every user's Device ID and force every paired peer through a fresh approval round.
### New file: `src-tauri/src/mesh/migration.rs`
`migrate_daemon_state_into_subdir(llm_dir, daemon_home)` moves:
- `.secrets/identity.json`: the ed25519 keypair. This is the critical one; everything else is recoverable but a new pubkey would split the user's mesh across every peer.
- `mesh/rosters/*.json`: per-network approved peers. Without these, every paired peer would re-prompt on next handshake.
- `mesh/states/*.json`: signed governance-state files for closed networks.
Idempotent (checks `identity_dst.exists()` at the top); destination collisions leave both files in place and log to stderr; cross-filesystem moves are handled via a copy + remove fallback. Four unit tests cover happy path, idempotence, fresh-install no-op, and destination collision.
### `src-tauri/src/main.rs`
The setup hook runs the migration before setting `MYOWNMESH_HOME`, then points the env var at the subdir. Comments spell out the bug we're fixing so the next person who touches this path doesn't accidentally regress it.
### `src-tauri/src/mesh/daemon.rs`
- `socket_for_mode(OwnLlm)` now returns `~/.myownllm/.myownmesh/daemon.sock` (matching the daemon's `data_dir()/daemon.sock` calculation under the new `MYOWNMESH_HOME`).
- `home` calculation uses the subdir.
- `Shared` mode untouched: that's the MyOwnMesh GUI's own `~/.myownmesh/daemon.sock` and lives outside the LLM's tree entirely.
### `src/config.ts`: recovery for users who already hit the bug
Their `config.json` is currently the daemon's shape. `salvageDaemonShapeLeakage` runs at the top of `mergeDefaults`:
1. Detects daemon shape via a top-level `networks` array whose entries all carry `id` + `network_id`, AND `cloud_mesh.networks` empty/absent.
2. Converts each daemon `NetworkConfig` to LLM `NetworkConfig` shape (signaling.servers → signaling_servers; `stun_servers[].urls` flattened; `turn_servers`' first url + auth lifted). LLM-only fields (`accepting`, `agent_permissions`, `prompts`, `auto_gossip`) default via `mergeNetwork`.
3. Strips daemon-only top-level keys (`version`, `identity_path`, `daemon`, `networks`) so subsequent saves are clean LLM shape.
`auto_update` and `auto_cleanup` are shared between both schemas with compatible field names, so they are left alone; the LLM's `mergeDefaults` handles them with its own per-field merge. No-op when nothing matches the detection signature: fresh installs + uncorrupted configs are untouched.
### `src-tauri/Cargo.toml`
Added `tempfile = "3"` under `[dev-dependencies]` for the migration tests.
## Why not patch myownmesh-core to preserve unknown fields?
That would also work but requires a MyOwnMesh release + `.myownmesh-rev` bump on the LLM side, and the fix would only slowly reach any user running a shared daemon that pre-dates the patch. The subdir isolation is a self-contained fix on the LLM side and prevents the same class of bug from recurring if the daemon's schema grows new top-level fields in the future.
## What's unrecoverable for affected users
The salvage in `config.ts` restores what the daemon kept: the network IDs and transport config. But the daemon's `NetworkConfig` doesn't carry the LLM's `accepting`, `agent_permissions`, `prompts`, or `auto_gossip` fields, so those reset to fresh defaults. Providers and `active_family` also reset because the daemon's write destroyed them. The user's identity (`identity.json`) and rosters survive untouched, since those weren't in `config.json`.
## Validation
- `pnpm run check`: 164 files, 0 errors, 0 warnings.
- `pnpm run build`: clean.
- `cargo test --bins -p myownllm` for the new migration module's four unit tests. (Sandbox lacks gdk-3.0, so the full crate doesn't build here; please verify locally.)
- Fresh install: `~/.myownllm/.myownmesh/.secrets/identity.json` appears, `~/.myownllm/config.json` stays clean LLM shape.
- Upgrade from a pre-fix build: `~/.myownllm/.secrets/identity.json` is gone.
- Upgrade from a pre-fix build that already hit the bug (daemon-shape `config.json`): saved networks recovered into `cloud_mesh.networks`, daemon-only top-level keys stripped, providers/active_family reset to defaults (irrecoverable).
- Save & activate a network: `~/.myownllm/config.json` retains full LLM shape across the operation; daemon's writes land in `~/.myownllm/.myownmesh/config.json` only.
- Switch active networks: `cloud_mesh.active_network_id` and the daemon's `networks` list stay in sync via reconcile.
https://claude.ai/code/session_01RLu1LdTgtxEDdzhybzqFrk
Generated by Claude Code