Node Manager Phase 2b-2a: groupKey + groupKeyMap kinds & per-node mutex reconcile queue #3960
Merged
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…re groupKey capacity Trigger observers fired the async reconcile with a naked `void`, so a rejection from a detached pass (e.g. a capacity command initiating an exchange against a peer torn down mid-flight) surfaced as an unhandled rejection. Route triggers through #fireTrigger, which logs and swallows. This also covers command-based verify. With it, groupKey.capacity() (KeySetReadAllIndices) is safe to restore. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…antics Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d.apply Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A verify pass that detects drift now re-applies in the same pass instead of re-pending, so reconcile(verify) converges deterministically without relying on a follow-up trigger. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
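A minimal sketch of the planning rule this commit describes, assuming hypothetical `ItemStatus` and `ReconcileAction` shapes (the real `planActions` signature is not shown in this PR text). The point is only that drift found during a verify pass maps straight to `apply`, with no intermediate re-pend state:

```typescript
// Hypothetical shapes for illustration only; not the actual planActions API.
type ReconcileAction = "apply" | "skip";

interface ItemStatus {
    id: string;
    committed: boolean; // desired state has been written to the device before
    presentOnDevice: boolean; // outcome of the kind's verify check
}

function planActions(items: ItemStatus[], verify: boolean): Map<string, ReconcileAction> {
    const plan = new Map<string, ReconcileAction>();
    for (const item of items) {
        if (!item.committed) {
            // Never written: always apply.
            plan.set(item.id, "apply");
        } else if (verify && !item.presentOnDevice) {
            // Drift detected by verify: re-apply in this same pass
            // instead of marking the item pending for a later trigger.
            plan.set(item.id, "apply");
        } else {
            plan.set(item.id, "skip");
        }
    }
    return plan;
}
```

Because the drifted item is applied in the pass that found it, a single awaited verify reconcile is enough to converge in this model.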
Replace the InFlightGuard + fire-and-forget trigger wiring with a per-ClientNode Mutex. Triggers synchronously enqueue a coalesced reconcile request (verify / capacity-refresh flags OR-merge); the mutex serializes passes per node, owns the work, and logs task rejections (no voided or silently swallowed promises). Explicit reconcile() runs via mutex.produce so it is awaitable and never overlaps a triggered pass, making convergence deterministic. A request arriving mid-pass coalesces into exactly one follow-up. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
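The coalescing behaviour described in this commit can be sketched roughly as follows. This is not the `Mutex` from `@matter/general`; `CoalescingReconcileQueue` and its methods are hypothetical names, and a plain promise chain stands in for the mutex so the example is self-contained:

```typescript
// Hypothetical sketch; a promise chain stands in for the per-node mutex.
interface ReconcileRequest {
    verify: boolean;
    refreshCapacity: boolean;
}

type ReconcilePass = (request: ReconcileRequest) => Promise<void>;

class CoalescingReconcileQueue {
    private pending: ReconcileRequest | undefined = undefined;
    private tail: Promise<void> = Promise.resolve();
    private readonly pass: ReconcilePass;

    constructor(pass: ReconcilePass) {
        this.pass = pass;
    }

    /** Synchronous trigger entry point: OR-merge into the queued request, or queue one. */
    request(request: ReconcileRequest): void {
        if (this.pending !== undefined) {
            this.pending.verify = this.pending.verify || request.verify;
            this.pending.refreshCapacity = this.pending.refreshCapacity || request.refreshCapacity;
            return;
        }
        this.pending = { verify: request.verify, refreshCapacity: request.refreshCapacity };
        // The chain owns the work; a failed pass is logged rather than voided.
        this.tail = this.tail
            .then(() => this.drain())
            .catch(error => console.error("triggered reconcile failed:", error));
    }

    /** Explicit reconcile: awaitable, and serialized behind any queued pass. */
    reconcile(request: ReconcileRequest): Promise<void> {
        const result = this.tail.then(() => this.pass(request));
        // The caller receives the rejection via `result`; the chain itself keeps going.
        this.tail = result.then(() => undefined, () => undefined);
        return result;
    }

    private async drain(): Promise<void> {
        // Take the merged request first, so triggers arriving during the pass
        // start a fresh pending request (exactly one follow-up pass).
        const request = this.pending;
        this.pending = undefined;
        if (request !== undefined) {
            await this.pass(request);
        }
    }
}
```

In this sketch, any number of triggers fired before a pass starts collapse into one pass, and any number fired while a pass is running collapse into one follow-up.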
Remove the now-unreachable repend ReconcileAction (verify drift applies directly); re-clear pending after mutex close in #unwirePeer; fix a stale test comment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…live device read Capacity counts come from the subscription-maintained state (stateOf), not a forced Matter read (getStateOf). Live reads stay only on the RMW path of data we modify (apply/remove), before writing. groupKey drops capacity() entirely: the key-set count has no subscribed attribute (only the KeySetReadAllIndices command), so the device's RESOURCE_EXHAUSTED on KeySetWrite is its over-capacity gate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
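As a rough illustration of the two capacity strategies in this commit: counts derived from already-subscribed state, and the device's own `RESOURCE_EXHAUSTED` response used as the over-capacity gate for key sets. All names below (`SubscribedGroupKeyState`, `groupKeyMapCapacity`, `writeKeySet`, the `status` property) are hypothetical stand-ins, not the engine's or matter.js's actual API. The fallback of 4 comes from the "floor 4" noted in the PR description:

```typescript
// Hypothetical shapes; the cached state is assumed to be already scoped to our fabric.
interface GroupKeyMapEntry {
    groupId: number;
    groupKeySetId: number;
}

interface SubscribedGroupKeyState {
    maxGroupsPerFabric?: number;
    groupKeyMap: GroupKeyMapEntry[];
}

const MAX_GROUPS_FLOOR = 4; // fallback when the attribute is not available

/** Capacity from subscription-maintained state: no forced device read. */
function groupKeyMapCapacity(state: SubscribedGroupKeyState): { used: number; max: number } {
    return {
        used: state.groupKeyMap.length,
        max: state.maxGroupsPerFabric ?? MAX_GROUPS_FLOOR,
    };
}

/** True if the error carries a RESOURCE_EXHAUSTED status (assumed error shape). */
function isResourceExhausted(error: unknown): boolean {
    return (
        typeof error === "object" &&
        error !== null &&
        (error as { status?: unknown }).status === "RESOURCE_EXHAUSTED"
    );
}

/** No capacity pre-check for key sets: the write itself reports over-capacity. */
async function writeKeySet(keySetWrite: () => Promise<void>): Promise<"written" | "over-capacity"> {
    try {
        await keySetWrite();
        return "written";
    } catch (error) {
        if (isResourceExhausted(error)) {
            return "over-capacity";
        }
        throw error; // anything else is a real failure
    }
}
```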
Sub-PR into `node-manager` (umbrella PR #3948 picks it up in CI). First slice of Phase 2b-2: the command-based group-key kinds, plus a reconcile-concurrency rework.

ItemKinds (GroupKeyManagement, root)
- `groupKey` (first command-based kind): provisions key sets via `KeySetWrite`/`KeySetReadAllIndices`/`KeySetRemove` (`peer.commandsOf(GroupKeyManagementClient)`). `apply` always writes (`KeySetWrite` is an idempotent overwrite and epoch keys are write-only/null, so key material is unverifiable); `verify` is presence-only; `capacity` = `maxGroupKeysPerFabric` (spec-floor 3 fallback); `groupKeySetId === 0` (IPK) → `ImplementationError`.
- `groupKeyMap` (attribute RMW, upsert-by-`groupId`): a group maps to exactly one key set, so `apply` replaces a differing mapping, no-ops the exact pair, and never duplicates a `groupId`; `capacity` = `maxGroupsPerFabric` (floor 4); rejects a group→IPK(0) mapping.

Verify semantics
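For the `groupKeyMap` kind listed under ItemKinds, the upsert-by-`groupId` rule can be sketched as a pure function over the current attribute value. This is an illustrative sketch only: `upsertGroupKeyMap` and the entry shape are hypothetical, and a plain `Error` stands in for whatever error type the engine actually throws:

```typescript
// Hypothetical entry shape and helper; not the engine's actual code.
interface GroupKeyMapEntry {
    groupId: number;
    groupKeySetId: number;
}

function upsertGroupKeyMap(
    current: readonly GroupKeyMapEntry[],
    desired: GroupKeyMapEntry,
): { next: GroupKeyMapEntry[]; changed: boolean } {
    if (desired.groupKeySetId === 0) {
        // Key set 0 is the IPK; mapping a group to it is rejected.
        throw new Error("a group must not be mapped to the IPK key set (0)");
    }
    const exact = current.some(
        entry => entry.groupId === desired.groupId && entry.groupKeySetId === desired.groupKeySetId,
    );
    if (exact) {
        // Exact pair already present: no write needed.
        return { next: current.slice(), changed: false };
    }
    // Drop any differing mapping for this group, then add the desired one,
    // so a groupId never appears twice.
    const next = current.filter(entry => entry.groupId !== desired.groupId);
    next.push({ groupId: desired.groupId, groupKeySetId: desired.groupKeySetId });
    return { next, changed: true };
}
```

In a read-modify-write flow, `current` would come from a live read of the attribute just before the write, and the write would be skipped when `changed` is `false`.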
- `planActions`: a verify pass that detects drift now re-applies directly in the same pass (was `repend`), so `reconcile(verify)` converges deterministically without relying on a follow-up trigger. The now-unreachable `repend` action was removed.

Concurrency rework (no fire-and-forget)
- Replaces `InFlightGuard` + `void`-ed trigger reconciles with a per-`ClientNode` `Mutex` (`@matter/general`). Sync trigger observers enqueue a coalesced request (`verify`/`refreshCapacity` OR-merge) via `mutex.run`; the `Mutex` owns and serializes the work and logs task rejections, so there are no voided or silently swallowed promises. Explicit `reconcile()` uses `mutex.produce` so it is awaitable and never overlaps a triggered pass. A request arriving mid-pass coalesces into exactly one follow-up.

Tests
- `groupKey` (apply/verify/remove/capacity/IPK), `groupKeyMap` (upsert/verify/remove/capacity), `planActions` verify-drift→apply, engine hardenings.
- `@matter/node/testing`: provision a key set + group-key map → committed + device shows both; remove the key set behind the engine → a `verify` reconcile re-applies (negative assertion proves it was gone first).
- `build --clean`, `format-verify`, `lint` green; `@matter/node-manager` 61/61; `@matter/node` 1227/1227.

Review
Whole-branch review + independent cross-check: merge-ready, zero Critical/Important. Concurrency rework confirmed safe (no-void, coalescing, serialization, no deadlock, no leak); `groupKeyMap` IPK-guard judged spec-correct.

Scope
2b-2b (next): `endpointGroupMembership` (Groups cluster, per-endpoint `AddGroup`/`RemoveGroup` commands, verify via `GetGroupMembership`).

🤖 Generated with Claude Code