📋 Pre-flight Checks
🔍 Problem Description
The current topic_key generation in SuggestTopicKey() has two limitations:
1. Word order sensitivity causes silent duplicates
The segment portion of the key preserves the original word order from the title. This means observations about the same topic can produce different keys depending on phrasing:
- "Auth model refactor" → architecture/auth-model-refactor
- "Refactor auth model" → architecture/refactor-auth-model
Since topic_key drives upsert logic, these become two separate observations instead of updating the same one. Over time this silently fragments memory, especially with AI agents as primary writers.
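To make the failure mode concrete, here is a minimal sketch of how order-preserving keys fragment an upsert store. It assumes the current behavior is roughly "lowercase the words and join them with hyphens". `orderPreservingSegment` and `upsert` are hypothetical stand-ins written for illustration, not the real `SuggestTopicKey()` code.

```go
package main

import (
	"fmt"
	"strings"
)

// orderPreservingSegment is a hypothetical stand-in for the current behavior:
// lowercase the title and join its words with hyphens, keeping their order.
func orderPreservingSegment(title string) string {
	return strings.Join(strings.Fields(strings.ToLower(title)), "-")
}

// upsert stores content under key, overwriting any existing entry with that key.
// A plain map is used here in place of the real store.
func upsert(store map[string]string, key, content string) {
	store[key] = content
}

func main() {
	store := map[string]string{}
	for _, title := range []string{"Auth model refactor", "Refactor auth model"} {
		key := "architecture/" + orderPreservingSegment(title)
		upsert(store, key, title)
	}
	// Both titles describe the same topic, yet two entries are stored.
	fmt.Println(len(store)) // 2
}
```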
2. Flat structure limits organization
The current format is family/segment (e.g., architecture/auth-model). There's no way to group related observations under a domain:
- architecture/auth-model and architecture/auth-middleware have no explicit relationship
- Querying "everything about auth architecture" requires FTS, not a simple prefix match
💡 Proposed Solution
A. Sort words alphabetically in segment generation
In normalizeTopicSegment(), sort the words before joining:
words := strings.Fields(strings.ToLower(segment))
sort.Strings(words)
segment = strings.Join(words, "-")
Before: "Auth model refactor" → auth-model-refactor
After: "Auth model refactor" → auth-model-refactor (same)
After: "Refactor auth model" → auth-model-refactor (same! was different before)
This makes upsert matching insensitive to word order: rephrasing a title with the same words no longer produces a new key.
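As a self-contained sketch of the idea, the snippet can be wrapped in a function and run against both phrasings. `sortedTopicSegment` is a hypothetical name used for illustration. The real `normalizeTopicSegment()` may do additional cleanup (punctuation, length limits) that is not shown here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sortedTopicSegment is a hypothetical helper: it lowercases the title,
// splits it on whitespace, sorts the words, and joins them with hyphens.
func sortedTopicSegment(title string) string {
	words := strings.Fields(strings.ToLower(title))
	sort.Strings(words)
	return strings.Join(words, "-")
}

func main() {
	for _, title := range []string{"Auth model refactor", "Refactor auth model"} {
		// Both iterations print the same segment.
		fmt.Println(sortedTopicSegment(title))
	}
}
```

`sort.Strings` orders by byte value, which gives plain alphabetical order for lowercase ASCII words.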
B. Support hierarchical sub-paths (family/domain/segment)
Allow an optional middle level in the key hierarchy:
architecture/auth/model
architecture/auth/middleware
bug/payments/nil-panic
decision/api/versioning-strategy
This enables prefix queries like:
SELECT * FROM observations WHERE topic_key LIKE 'architecture/auth/%'
No schema change required — topic_key is already a TEXT column. This is purely a convention + generation logic change.
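One way the generation side could look, as a rough sketch: the domain is optional, so existing two-level keys stay valid. `buildTopicKey`, `prefixPattern`, and `underDomain` are hypothetical helpers, not existing functions in the codebase. `underDomain` only mimics the `LIKE 'family/domain/%'` check in memory.

```go
package main

import (
	"fmt"
	"strings"
)

// buildTopicKey joins family, an optional domain, and segment with "/".
// An empty domain yields the existing two-level family/segment form.
func buildTopicKey(family, domain, segment string) string {
	parts := []string{family}
	if domain != "" {
		parts = append(parts, domain)
	}
	parts = append(parts, segment)
	return strings.Join(parts, "/")
}

// prefixPattern returns a LIKE pattern matching every key under family/domain.
func prefixPattern(family, domain string) string {
	return family + "/" + domain + "/%"
}

// underDomain reports whether key sits under family/domain, mirroring the
// prefix match that the LIKE pattern performs in SQL.
func underDomain(key, family, domain string) bool {
	return strings.HasPrefix(key, family+"/"+domain+"/")
}

func main() {
	keys := []string{
		buildTopicKey("architecture", "auth", "model"),
		buildTopicKey("architecture", "auth", "middleware"),
		buildTopicKey("bug", "payments", "nil-panic"),
		buildTopicKey("architecture", "", "auth-model"), // legacy two-level key
	}
	for _, k := range keys {
		fmt.Println(k, underDomain(k, "architecture", "auth"))
	}
	fmt.Println(prefixPattern("architecture", "auth"))
}
```

Note that the trailing slash in the prefix keeps the flat key architecture/auth-model out of the architecture/auth/ group. In SQL, `_` and `%` inside the prefix act as wildcards, so a family or domain containing them would need escaping.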
📦 Affected Area
Store (database, queries)
🔄 Alternatives Considered
Considered adding a tags column with a junction table for flexible categorization. Discarded because the primary writers are AI agents, which are inconsistent at tagging (e.g., auth vs authentication vs login). Without a human curation loop, tags would add noise rather than improve search. FTS5 + topic_key already covers the search use case without the extra schema complexity.
📎 Additional Context
Recommendation
I recommend implementing both solutions together — they complement each other well:
- A (sorted segments) eliminates silent duplicates at the source
- B (sub-paths) adds the organizational layer that makes prefix queries useful
Implementing them separately would work, but together they deliver a cohesive improvement to topic_key reliability and discoverability.
Volunteer
I'm happy to take this on and submit a PR for both changes. The scope is well-defined and contained.
Impact
- Reduces duplicate observations caused by inconsistent agent phrasing
- Improves memory organization with hierarchical grouping
- Zero schema changes — works within existing TEXT column
- Low implementation effort: core change is ~10 lines in normalizeTopicSegment() + convention updates