From c9952d80ebabeab8e445ca16c01b473ea5ec9cd9 Mon Sep 17 00:00:00 2001
From: Jordan Burger
Date: Tue, 23 Jun 2026 14:02:13 -0400
Subject: [PATCH] feat(kb-management): name/acronym gate + entity-creation trigger + staleness observability
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three rules upstreamed from accumulated instance experience into
phases/core/kb-management.md (assembles into SKILL + DREAMING, not RESEARCH):

- Never Guess a Name or Acronym Expansion — cite the full form from a primary source or leave the token abbreviated and [unverified]; never invent a plausible expansion (it reads as fact and propagates).
- Entity-creation trigger — mint a new entity file when a person, org, or technology recurs across 3+ independent sources. Enriching existing files is not a substitute for creating the missing ones.
- Staleness observability + search depth — every KB file carries a machine-readable last_updated: so a scan can rank and a driver can queue stale files (opportunistic-only refresh lets the long tail rot); and don't scope completeness searches to a fixed channel set / from:me — a known-to-exist fact the scan missed is a search-depth miss, not absence.

All three rules are generic; originating examples (incl. a customer name) were dropped.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 CHANGELOG.md                 |  3 +++
 phases/core/kb-management.md | 14 ++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index ee62a93..09f70bb 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,9 @@ this project adheres to [Semantic Versioning](https://semver.org/).
 ## [Unreleased]
+### Added
+- **KB-management hardening** (`phases/core/kb-management.md`) — three rules upstreamed from accumulated instance experience: **Never Guess a Name or Acronym Expansion** (cite the full form from a primary source or leave it abbreviated `[unverified]` — never invent a plausible expansion, which reads as fact and propagates); an **entity-creation trigger** (mint a new entity file when a person/org/technology recurs across 3+ independent sources — enriching existing files is not a substitute for creating missing ones); and **staleness-observability + search-depth** freshness rules (every file carries a machine-readable `last_updated:` so a scan can rank and a driver can queue stale files — opportunistic-only refresh lets the long tail rot; and don't scope completeness searches to a fixed channel set / `from:me` filters — a known-to-exist fact your scan missed is a search-depth miss, not an absence).
+
 ## [0.7.2] - 2026-06-22
diff --git a/phases/core/kb-management.md b/phases/core/kb-management.md
index c5a3337..fda85d0 100644
--- a/phases/core/kb-management.md
+++ b/phases/core/kb-management.md
@@ -75,6 +75,12 @@ The KB is the **persistent memory** of this system. Action items are ephemeral (
 4. Add any new people to `people.md`
 5. Add any new issues to the issue tracker file
+**Create a new entity file (`knowledge-base/people/`, `ontology/entities/`) when:**
+- A person, organization, or technology **recurs across 3+ independent sources** (e.g. named in two meetings and a message thread, or an issue + a PR + a calendar invite). Recurrence at that threshold means it's a real entity worth tracking, not a one-off mention — the trigger is the recurrence itself; don't wait to be told.
+- It has relationships worth tracking (who it works with, what it depends on, which projects it touches).
+
+Enriching the files you already have is **not** a substitute for minting the ones you're missing: a run that goes deep on existing entities but never creates one for a person/org/tech that has clearly crossed the recurrence threshold leaves a structural gap in the graph.
+
 **Do NOT create a new file when:**
 - A topic is just a sub-item of an existing project (add it to that project's file instead)
 - It's a one-off task with no ongoing context (that's an action item, not a KB entry)
@@ -105,6 +111,10 @@ Every KB file should have a "Last updated" or "Last verified" line. The standard
 During consolidation KB audits, **prioritize the stalest high-priority files** when choosing what to audit.
+**Make staleness observable, don't just assert it.** The table above is only enforceable if each file's age is machine-readable: every KB file should carry a `last_updated:` property (and, where possible, its latest-commit date) — not just a prose "Last updated" line — so a scan can rank files by staleness and a refresh driver can queue the over-threshold ones. Freshness enforced *opportunistically* — a file refreshed only when a run happens to touch its project — lets the long tail rot: files with no recent connector activity never get picked, and nothing in the system ever *sees* them aging.
+
+**Widen discovery beyond a fixed net.** A material fact can land in an unwatched channel, an off-keyword phrasing, or a source you don't routinely scan — so don't scope KB-completeness searches to a fixed channel set or to `from:me`-style filters alone. When a fact is known to exist (referenced in a meeting, a message, or by {{USER_NAME}}) but your scan didn't surface it, treat that as a *search-depth miss*, not an absence: broaden the query (other channels, both directions, alternate terms) until you find it.
+
 ### Review Queue — `knowledge-base/review-queue.md`
 **When you are uncertain about something, DO NOT write it to the KB. Put it in the review queue instead.** {{USER_NAME}} will verify it and either approve it into the KB or reject it.
@@ -143,6 +153,10 @@ When writing KB content, use these markers:
 - **[contradicted]** = two sources disagree; both claims noted with sources cited — always add to review queue
 - **[speculative]** = an inferred causal/contributory link that no single source actually states; allowed only with this marker (see Causal-Claim Gate)
+### Never Guess a Name or Acronym Expansion
+
+When you encounter an acronym, an abbreviation, or a partial/initialled name, **never expand it from a guess** — a plausible-but-wrong expansion reads as fact and propagates across the KB. Cite the **full form from a primary source** (the sender's signature, an org's own site or docs, an issue/PR body, a calendar invite, the person's own message), or leave the token in its original abbreviated form marked `[unverified]`. The failure mode is inventing a confident full name for an acronym that actually stands for something else entirely. This applies equally to organization names, product/codenames, team names, and people's full names. If {{USER_NAME}} corrects an expansion, record the correct full form so it resolves next time.
+
 ### Causal-Claim Gate
 Any statement asserting that one issue / PR / metric / event **caused, contributed to, blocked, or drove** another is a high-risk inference — it reads as fact but is usually the run's own synthesis. Before writing such a cross-entity causal claim (in the KB or a DM):