Claude/fix glasswally agent glitch by noisyloop · Pull Request #38 · noisyloop/EverythingOS

noisyloop · 2026-05-18T18:51:25Z

No description provided.

Behavioral fingerprint probes lived only in source, so an adversary reading the repo could craft a model that passes exactly those probes. Adds two controls, both pinned per-baseline so drift detection stays comparable and existing baselines need no migration:

- MODEL_GUARD_PROBES_FILE: a validated out-of-band probe set that fully replaces the built-ins. Fails closed in production if unloadable; it never silently degrades to the publicly known probes.
- MODEL_GUARD_PROBE_COUNT: fingerprint with a crypto-random k-of-n probe subset; the selection is recorded in the baseline and reused on every subsequent check.

A baseline whose pinned probes are absent from the active pool fails closed rather than emitting an incomparable fingerprint.

https://claude.ai/code/session_01ArAvRMiZgCwF5oNj3r94Ap
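The k-of-n selection and the fail-closed check on pinned probes can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Probe` and `Baseline` shapes and the function names `selectProbeIds` and `resolvePinnedProbes` are assumptions made for the example.

```typescript
import { randomInt } from "node:crypto";

// Hypothetical shapes; the PR does not show the real types.
interface Probe {
  id: string;
  prompt: string;
}
interface Baseline {
  probeIds: string[]; // selection recorded when the baseline was created
}

// Choose k distinct probe ids using a partial Fisher-Yates shuffle
// driven by crypto.randomInt (a CSPRNG), not Math.random.
function selectProbeIds(pool: Probe[], k: number): string[] {
  if (!Number.isInteger(k) || k < 1 || k > pool.length) {
    throw new Error(`probe count ${k} is out of range for a pool of ${pool.length}`);
  }
  const ids = pool.map((p) => p.id);
  for (let i = 0; i < k; i++) {
    const j = randomInt(i, ids.length); // uniform in [i, ids.length)
    const tmp = ids[i]!;
    ids[i] = ids[j]!;
    ids[j] = tmp;
  }
  return ids.slice(0, k);
}

// Resolve the pinned selection against the active pool. Any missing
// probe is an error (fail closed) instead of a silently smaller set.
function resolvePinnedProbes(baseline: Baseline, pool: Probe[]): Probe[] {
  const byId = new Map<string, Probe>();
  pool.forEach((p) => byId.set(p.id, p));
  return baseline.probeIds.map((id) => {
    const probe = byId.get(id);
    if (!probe) {
      throw new Error(`pinned probe "${id}" is missing from the active pool`);
    }
    return probe;
  });
}
```

Recording the ids rather than re-sampling on each check is what keeps successive fingerprints comparable: every check runs the same probes the baseline was built from.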

The swarm mesh let any node on the segment inject itself as a peer (discovery announcements were entirely unauthenticated), and the message HMAC covered only id:from:to:timestamp, not type or payload, so a captured signed message could be tampered with and still verify. There was no replay defense.

With meshSecret set:

- Discovery announcements must carry a fresh, non-replayed HMAC proof of the deployment secret (domain-separated from message signatures); unauthenticated, forged, stale, or replayed enrollments are rejected and surfaced via mesh:peer:rejected.
- Message signatures now cover a canonical envelope including type and a hash of payload, removing the ':'-join ambiguity.
- Per-message and per-announcement nonces with a freshness window give bounded replay protection.

Backward compatible: without meshSecret the mesh stays open for dev but logs a loud warning. Full X.509 mTLS remains the documented ideal.

https://claude.ai/code/session_01ArAvRMiZgCwF5oNj3r94Ap
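A canonical-envelope signature with nonce-based replay protection might look roughly like this. It is a sketch under assumptions: the `MeshMessage` shape, the `mesh-message-v1` domain label, the 30-second window, and the names `signMessage`, `verifyMessage`, and `ReplayGuard` are all hypothetical and not taken from the PR.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical message shape for illustration.
interface MeshMessage {
  id: string;
  from: string;
  to: string;
  type: string;
  timestamp: number; // ms since epoch
  nonce: string;
  payload: unknown;
}

const MESSAGE_DOMAIN = "mesh-message-v1"; // assumed domain-separation label
const FRESHNESS_MS = 30000; // assumed freshness window

// A JSON array has unambiguous field boundaries, unlike joining with ':'.
// Assumes both sides serialize the payload identically.
function canonicalEnvelope(m: MeshMessage): string {
  const payloadHash = createHash("sha256")
    .update(JSON.stringify(m.payload ?? null))
    .digest("hex");
  return JSON.stringify([
    MESSAGE_DOMAIN, m.id, m.from, m.to, m.type, m.timestamp, m.nonce, payloadHash,
  ]);
}

function signMessage(m: MeshMessage, secret: string): string {
  return createHmac("sha256", secret).update(canonicalEnvelope(m)).digest("hex");
}

// Remembers nonces only while their timestamp is inside the window,
// so memory use stays bounded.
class ReplayGuard {
  private seen = new Map<string, number>();

  accept(nonce: string, timestamp: number, now: number): boolean {
    this.seen.forEach((t, n) => {
      if (now - t > FRESHNESS_MS) this.seen.delete(n);
    });
    if (Math.abs(now - timestamp) > FRESHNESS_MS) return false; // stale or far future
    if (this.seen.has(nonce)) return false; // replay
    this.seen.set(nonce, timestamp);
    return true;
  }
}

function verifyMessage(
  m: MeshMessage, signature: string, secret: string, guard: ReplayGuard, now: number,
): boolean {
  const expected = Buffer.from(signMessage(m, secret), "hex");
  const given = Buffer.from(signature, "hex");
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) {
    return false;
  }
  // Only authenticated messages consume a nonce slot.
  return guard.accept(m.nonce, m.timestamp, now);
}
```

Checking the signature before recording the nonce means an attacker without the secret cannot fill the nonce cache or burn a legitimate sender's nonce.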

…m memory

Long-term memory was ranked purely by keyword overlap with no notion of trust, so crafted content could score top relevance on legitimate queries and be injected into prompts (only output was sanitized, after retrieval).

- Store-time injection detection: content with injection patterns is written with low trust and flagged (not refused), emitting memory:longterm:poisoning_suspected.
- Trust-weighted retrieval: relevance is multiplied by per-entry trust; flagged or sub-floor entries are excluded. Centralized in LongTermMemory.search so it covers keyword and semantic paths; legacy entries without trust default above the floor (back-compat).
- Bounded keyword-stuffing breadth heuristic: an entry that near-exactly matches more than N distinct queries is flagged and excluded, emitting memory:longterm:poisoning_detected.

Security trust/flag fields cannot be overridden via caller metadata. Semantic provenance scoring remains the documented open problem.

https://claude.ai/code/session_01ArAvRMiZgCwF5oNj3r94Ap
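Trust-weighted retrieval as described above could be sketched like this. The entry shape, the floor of 0.3, the legacy default of 0.5, and the simple keyword scorer are assumptions for illustration; the actual LongTermMemory.search in the repo may differ.

```typescript
// Hypothetical entry shape; trust is optional to model legacy entries.
interface MemoryEntry {
  id: string;
  text: string;
  trust?: number; // 0..1
  flagged?: boolean;
}

const TRUST_FLOOR = 0.3; // assumed value
const LEGACY_DEFAULT_TRUST = 0.5; // assumed value, deliberately above the floor

// Fraction of distinct query words that appear in the text.
function keywordRelevance(query: string, text: string): number {
  const queryWords = new Set(query.toLowerCase().split(/\s+/).filter(Boolean));
  if (queryWords.size === 0) return 0;
  const textWords = new Set(text.toLowerCase().split(/\s+/).filter(Boolean));
  let hits = 0;
  queryWords.forEach((w) => {
    if (textWords.has(w)) hits++;
  });
  return hits / queryWords.size;
}

// Exclude flagged and sub-floor entries, then rank by relevance * trust.
function searchMemory(entries: MemoryEntry[], query: string, limit = 5) {
  return entries
    .filter((e) => !e.flagged)
    .map((e) => ({ entry: e, trust: e.trust ?? LEGACY_DEFAULT_TRUST }))
    .filter((x) => x.trust >= TRUST_FLOOR)
    .map((x) => ({
      entry: x.entry,
      score: keywordRelevance(query, x.entry.text) * x.trust,
    }))
    .filter((x) => x.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Because trust multiplies relevance instead of replacing it, a perfectly matching but low-trust entry can still rank below a partial match from a trusted source, and anything under the floor never reaches the prompt at all.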

API keys lived in process.env, readable by any in-process code, including plugins. Adds a sealed in-memory secrets provider: at finalizeStartup (prod-gated via NODE_ENV=production or EOS_SEAL_SECRETS=1), credentials are captured and deleted from process.env, then served only through the gated getSecret()/requireSecret(). Non-sealed keys fall through to the prior provider; dev/test are unaffected (no-op unless enabled).

LLM provider constructors, the Glasswally IOC secret, and the Discord LLM fallbacks now resolve via getSecret() so they keep working after the environment is sealed (HMAC secrets are already captured into module consts at import, before finalization). External secrets manager / HSM remains the documented ideal.

https://claude.ai/code/session_01ArAvRMiZgCwF5oNj3r94Ap
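The capture-then-delete pattern can be illustrated with a small sketch. The class name `SealedSecrets`, the helper `shouldSeal`, and the idea of passing the environment in as a plain object are assumptions for the example; the access gating on getSecret() that the commit mentions is not modeled here.

```typescript
type Env = Record<string, string | undefined>;

// Mirrors the gating described in the commit message.
function shouldSeal(env: Env): boolean {
  return env.NODE_ENV === "production" || env.EOS_SEAL_SECRETS === "1";
}

class SealedSecrets {
  private sealed = new Map<string, string>();

  // `keys` lists which environment variables count as credentials.
  constructor(private readonly env: Env, private readonly keys: string[]) {}

  // Copy each credential into private memory, then remove it from the
  // environment so other in-process code can no longer read it there.
  seal(): void {
    for (const key of this.keys) {
      const value = this.env[key];
      if (value !== undefined) {
        this.sealed.set(key, value);
        delete this.env[key];
      }
    }
  }

  // Sealed keys come from private memory; anything else falls through
  // to the environment, standing in for the prior provider.
  getSecret(name: string): string | undefined {
    const sealedValue = this.sealed.get(name);
    return sealedValue !== undefined ? sealedValue : this.env[name];
  }

  requireSecret(name: string): string {
    const value = this.getSecret(name);
    if (value === undefined) throw new Error(`missing required secret: ${name}`);
    return value;
  }
}
```

In a real process the environment argument would be `process.env`, and `seal()` would run once during startup finalization, only when `shouldSeal` returns true.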

README Known Limitations now reflects reality: formal threat model exists (docs/STRIDE.md, resolved); single-process honestly described (HIGH-tier already worker-thread isolated, residual gap noted); Glasswally bullet notes the D-6 line-buffer DoS is now actually enforced. The mesh/credentials/probes/memory bullets were updated alongside their fixes in prior commits.

Also isolates AUDIT_LOG_PATH / AGENT_REVOCATION_LOG / DECISION_LEDGER_PATH / MODEL_GUARD_DIR per jest worker via a setupFiles script. Parallel workers previously appended to the same on-disk audit log, intermittently corrupting the hash chain that the e2e suite verifies. Per-worker temp paths remove the race without serializing tests. No production code change.

https://claude.ai/code/session_01ArAvRMiZgCwF5oNj3r94Ap
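A setupFiles script of the kind described might look like the sketch below. It relies on the JEST_WORKER_ID environment variable that Jest sets for each worker; the function name `isolateWorkerPaths`, the `eos-worker-` prefix, and the file names inside the temp directory are assumptions, not the contents of the actual script.

```typescript
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Point every on-disk log at a fresh temp directory for this worker,
// so parallel workers never append to the same file.
function isolateWorkerPaths(env: Record<string, string | undefined>): string {
  const workerId = env.JEST_WORKER_ID ?? "0";
  const dir = mkdtempSync(join(tmpdir(), `eos-worker-${workerId}-`));
  env.AUDIT_LOG_PATH = join(dir, "audit.log"); // file names are illustrative
  env.AGENT_REVOCATION_LOG = join(dir, "revocations.log");
  env.DECISION_LEDGER_PATH = join(dir, "ledger.log");
  env.MODEL_GUARD_DIR = join(dir, "model-guard");
  return dir;
}

// When listed under `setupFiles` in the Jest config, this runs in each
// test environment before any test module is loaded.
isolateWorkerPaths(process.env);
```

Because the paths are set before application modules are imported, code that reads these variables at import time also picks up the per-worker locations.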

claude added 5 commits May 18, 2026 03:05

noisyloop merged commit 066aa79 into main May 18, 2026

noisyloop mentioned this pull request May 18, 2026

security: STRIDE verification audit — fixes (E-5, T-5, T-4), truth-up, claim/evidence CI gate #39

Merged

4 tasks

noisyloop merged 5 commits into main from claude/fix-glasswally-agent-UVhSV

2 participants
