diff --git a/BRAINDB_GUIDE.md b/BRAINDB_GUIDE.md
index 7a3e95c..4837a8c 100644
--- a/BRAINDB_GUIDE.md
+++ b/BRAINDB_GUIDE.md
@@ -5,6 +5,29 @@ The API runs at **http://localhost:8000**. Everything is done via HTTP calls.
 
 ---
 
+## ⚠ TOOL PRIORITY (read this first)
+
+BrainDB's value is the graph + embeddings + ranking. Use that power; do not
+fall back to flat SQL.
+
+1. **`POST /api/v1/memory/context`** — default for **query-driven** recall,
+   discovery, understanding ("what do we know about X?"). Keyword-mediated
+   fuzzy + embeddings + graph + ranking.
+2. **`GET /api/v1/memory/tree/?max_depth=N`** — reveals an entity's
+   neighbourhood as a nested JSON tree (root keyed by `entity_type`,
+   `children` arrays per node, keyword + retired-wiki noise filtered,
+   `_truncated` last-child marker if more remain). Especially useful when
+   you have an entity ID and want its graph context. On hub entities
+   (wikis with many connections) pass `max_depth=3` for narrative chains.
+3. **`POST /api/v1/agent/query`** ("delegate to a subagent") — for multi-step
+   investigation / disambiguation.
+4. `GET /api/v1/entities…` and `/entities//relations` — direct lookups.
+5. **`POST /api/v1/memory/sql` ⚠ exception only** — aggregates (counts,
+   GROUP BY, activity-log joins). NEVER for recall, discovery, similarity,
+   understanding, or "what's around this entity" — those are the tools above.
+
+---
+
 ## Entity Types
 
 | Type | What to store |
@@ -270,7 +293,12 @@ curl -X POST http://localhost:8000/api/v1/memory/context \
 ```bash
 curl http://localhost:8000/api/v1/memory/tree/?max_depth=2
 ```
-Returns the entity and all its graph connections organized by depth and relevance.
+Returns the entity and its 1-N hop neighbourhood as a nested JSON tree (root
+keyed by `entity_type`, `children` arrays per node, multi-path first-wins by
+best accumulated path score, keyword + retired-wiki noise filtered by default).
+Optional query params: `include_keywords` (default `false`), `top_k` (default
+`40`), `min_path_score` (default `0.0`). A `_truncated` last-child marker
+appears when `top_k` clips the result.
 
 ### Get All Rules
 ```bash
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5c97dc3..0a149fe 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,79 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.4.0] — 2026-06-03
+
+Headline: a focused pass on recall quality and the `view_tree` tool. The
+per-edge LLM judgment that was missing on `create_relation` is now wired
+through to graph scoring, and `view_tree` returns a nested JSON tree the
+agent can actually navigate (vs the depth-grouped text that silently
+clipped 70% of connections on popular wikis).
+
+### Changed
+
+- **`view_tree` / `GET /api/v1/memory/tree/` — nested JSON shape.**
+  Root keyed by `entity_type`, `children` arrays per node, multi-path
+  first-wins by best accumulated path score, keyword + retired-wiki noise
+  filtered by default, `_truncated` last-child marker when more remain.
+  One shared builder (`build_entity_tree` in `braindb/services/tree.py`)
+  for the HTTP endpoint and the agent tool — no behaviour drift. New
+  optional query params: `include_keywords` (default `false`), `top_k`
+  (default `40`), `min_path_score` (default `0.0`). Bench: Path B (Qwen
+  27B) 5/5 PASS, **−25% wall-clock**, **−26% tool calls**, **zero
+  `delegate` calls** on the hardest question (was 2). Path A (Claude
+  Code + curl skill) 5/5 PASS, `view_tree` usage 0 → 1-2 calls — the
+  structured shape is now usable in practice. Numbers in
+  `benchmarks/runs/round-2f_comparison.md`.
+- **`create_relation` writes both edge scores.** The `importance_score`
+  column had been NULL for every agent-created row since day one; the
+  parameter is now on the tool, the watcher's extraction prompt no
+  longer dictates literal score values (the LLM judges per docstring),
+  and the graph CTE multiplies `relevance_score × COALESCE(
+  importance_score, 0.5) × depth_penalty` per hop. The
+  `is_bidirectional` field on `relations` is now ignored by graph
+  traversal — every edge walks both ways (matching what
+  `services/tree.py` already did). The field stays in the schema for
+  backwards compatibility.
+- **Seed-similarity propagation in graph scoring.** Recall hops carry
+  the seed's similarity score forward through the graph; the depth
+  multiplier is softened so default-quality intermediates don't collapse
+  depth-2/3 results.
+- **Prompts**: `view_tree` reframed as a capability for "explore around
+  this entity"; `search_sql` demoted to an aggregates-only exception.
+  Same wording applied to `system_prompt.md`, both skill files
+  (`skills/braindb/SKILL.md`, `skills/braindb-agent/SKILL.md`),
+  `README.md`, and `BRAINDB_GUIDE.md`.
+
+### Fixed
+
+- **`view_tree` keyword noise**: keyword entities were leaking through
+  non-`tagged_with` edges (e.g. `similar_to` between keywords). The
+  filter now applies to target `entity_type='keyword'` as well as the
+  edge type. Surfaced by the new nested JSON shape and fixed in the
+  same commit.
+- **`view_tree` duplicate retired-wiki siblings**: when the wiki
+  maintainer creates a wiki and later consolidates duplicates, the
+  retired ones used to still appear in `view_tree` output as duplicate
+  siblings of the canonical wiki. The tree CTE now LEFT JOINs
+  `wikis_ext` and skips rows where `retired_at IS NOT NULL`.
+- **Test isolation in `tests/test_ingest.py`**: the three ingest tests
+  used fixed content strings, so a previous run's row in the DB caused
+  dedup-by-hash to fire and the 201 assertion to fail on subsequent
+  runs. Each test now prepends a per-run `uuid.uuid4().hex` to its
+  content.
+- **Missing test dep**: added `pytest-asyncio==0.23.7` to
+  `[project.optional-dependencies].dev`. Existing
+  `@pytest.mark.asyncio` tests were silently failing on clean installs.
+
+### Upgrading from v0.3.0
+
+No DB migration. No env-var changes. The wiki maintainer's existing
+retired-wiki pipeline now also gates `view_tree` traversal — old wikis
+with `retired_at IS NOT NULL` are silently skipped in tree output (they
+remain readable via `GET /api/v1/entities/`). `pyproject.toml`
+version field was at `0.2.0` in v0.3.0's tagged release (the bump was
+missed); this release catches it up to `0.4.0` in one step.
+
 ## [0.3.0] — 2026-05-25
 
 Headline: a small read-only frontend lands so humans can browse the same
diff --git a/README.md b/README.md
index eb54ab3..d664d42 100644
--- a/README.md
+++ b/README.md
@@ -134,7 +134,7 @@ Drop a markdown file into `data/sources/` and the watcher sidecar picks it up wi
 | POST | `/api/v1/entities/datasources/ingest` | Read a file from disk and create a datasource entity |
 | POST | `/api/v1/memory/search` | Fast fuzzy search |
 | POST | `/api/v1/memory/context` | Full retrieval: fuzzy → graph → decay → rank |
-| GET | `/api/v1/memory/tree/{id}` | Entity graph tree — connections by depth |
+| GET | `/api/v1/memory/tree/{id}` | Entity neighbourhood as a nested JSON tree |
 | GET | `/api/v1/memory/log` | Activity log — when and how things happened |
 | POST | `/api/v1/memory/sql` | Read-only SQL queries (SELECT/WITH only) |
 | POST | `/api/v1/memory/generate-embeddings` | Batch-generate keyword embeddings |
diff --git a/braindb/agent/prompts/system_prompt.md b/braindb/agent/prompts/system_prompt.md
index 335bbd0..7d289d3 100644
--- a/braindb/agent/prompts/system_prompt.md
+++ b/braindb/agent/prompts/system_prompt.md
@@ -26,7 +26,7 @@ CRITICAL — every assistant message MUST be a tool call; never plain prose. The
 - `delete_entity(entity_id)`
 
 **Relations:**
-- `create_relation(from_entity_id, to_entity_id, relation_type, relevance_score, description)`
+- `create_relation(from_entity_id, to_entity_id, relation_type, relevance_score, importance_score, description)`
 - `view_entity_relations(entity_id)`
 - `delete_relation(relation_id)`
 
@@ -50,24 +50,30 @@ CRITICAL — every assistant message MUST be a tool call; never plain prose. The
 BrainDB's value is the graph + embeddings + ranking. Use that power; do not
 fall back to flat SQL.
 
-1. **`recall_memory`** — the default for ALL recall, discovery, and
-   understanding: multi-query fuzzy + full-text + **keyword-embedding** +
-   graph traversal + decay + ranking. This is almost always the right first
-   call.
-2. **`delegate_to_subagent`** — for any multi-step investigation or
+1. **`recall_memory`** — default for ALL **query-driven** recall, discovery,
+   and understanding: multi-query fuzzy + full-text + **keyword-embedding** +
+   graph traversal + decay + ranking. Use first when you don't yet know which
+   specific entity you want — "what do we know about X."
+2. **`view_tree(, max_depth=N)`** — the efficient "explore around this
+   entity" tool. Returns a nested JSON tree: root keyed by entity_type,
+   `children` arrays per node, 1-N hops out, keyword/retired-wiki noise
+   filtered, `_truncated` marker if more remain. When you have an entity
+   ID and want what's connected, this is usually a better next step than
+   another `recall_memory`. On hub entities (wikis), pass `max_depth=3`
+   to see narrative chains.
+3. **`delegate_to_subagent`** — for any multi-step investigation or
    disambiguation ("is this the same person/thing?", "find and resolve X").
    A fresh agent with the full toolset; returns a summary. Prefer this over
    doing a long crawl yourself.
-3. `view_tree` / `view_entity_relations` / `get_entity` / `list_entities` —
-   targeted structure lookups.
-4. **`search_sql` — exception only.** A blunt SELECT has no embeddings, no
-   graph, no ranking — it throws away everything BrainDB is good at. Use it
-   *only* for a specific structured/aggregate question the tools above cannot
-   express (counts, GROUP BY, activity-log joins). Never for recall,
-   discovery, similarity, or understanding.
+4. `view_entity_relations` / `get_entity` / `list_entities` — direct lookups
+   (single-hop relations, full body of one entity, listing by filter).
+5. **`search_sql` ⚠ exception only — aggregates only** (counts, GROUP BY,
+   activity-log joins, schema inspection). A blunt SELECT has no embeddings,
+   no graph, no ranking — it throws away everything BrainDB is good at.
+   Never for recall, discovery, similarity, or understanding.
 
 If you reach for `search_sql` to "find" or "understand" something, stop —
-that's a `recall_memory` or `delegate_to_subagent` job.
+that's a `recall_memory` or `view_tree` or `delegate_to_subagent` job.
 
 ## READING CONTENT — previews vs the full body
 
diff --git a/braindb/agent/tools.py b/braindb/agent/tools.py
index 3628c12..b2fca19 100644
--- a/braindb/agent/tools.py
+++ b/braindb/agent/tools.py
@@ -112,10 +112,10 @@ async def recall_memory(
     queries: list[str],
     max_results: int = settings.recall_default_max_results,
 ) -> str:
-    """Search BrainDB memory with multiple natural language queries.
+    """⭐ Primary recall tool — use FIRST for ANY "what do we know about X" question.
+
     Runs fuzzy + fulltext + keyword embedding search, merges with geometric mean,
     traverses the graph up to 3 hops, applies temporal decay.
-    Use this as the primary recall tool.
     QUERY STRATEGY — IMPORTANT for high-recall on narrow subjects:
 
@@ -542,7 +542,8 @@ async def create_relation(
     from_entity_id: str,
     to_entity_id: str,
     relation_type: str,
-    relevance_score: float = 0.7,
+    relevance_score: float = 0.5,
+    importance_score: float = 0.5,
     description: Optional[str] = None,
 ) -> str:
     """Create a relation between two entities.
@@ -551,7 +552,8 @@ async def create_relation(
         from_entity_id: Source entity UUID.
         to_entity_id: Target entity UUID.
         relation_type: One of: supports, contradicts, elaborates, refers_to, derived_from, similar_to, is_example_of, challenges, tagged_with.
-        relevance_score: 0-1 (default 0.7).
+        relevance_score: 0-1 — how tight the semantic link is (0.9 = strong, 0.5 = neutral / "didn't judge", 0.2 = weak).
+        importance_score: 0-1 — how much losing this edge would degrade recall (0.9 = critical, 0.5 = neutral / "didn't judge", 0.2 = trivial).
         description: Why this relation exists.
     """
     try:
@@ -559,9 +561,9 @@ async def create_relation(
             with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
                 try:
                     cur.execute(
-                        """INSERT INTO relations (from_entity_id, to_entity_id, relation_type, relevance_score, description)
-                           VALUES (%s, %s, %s, %s, %s) RETURNING id""",
-                        (from_entity_id, to_entity_id, relation_type, relevance_score, description),
+                        """INSERT INTO relations (from_entity_id, to_entity_id, relation_type, relevance_score, importance_score, description)
+                           VALUES (%s, %s, %s, %s, %s, %s) RETURNING id""",
+                        (from_entity_id, to_entity_id, relation_type, relevance_score, importance_score, description),
                     )
                     rid = cur.fetchone()["id"]
                 except psycopg2.errors.UniqueViolation:
@@ -634,33 +636,27 @@ async def delete_relation(relation_id: str) -> str:
 @function_tool
 @_verbose("view_tree")
 async def view_tree(entity_id: str, max_depth: int = 2) -> str:
-    """View the graph of connections around an entity (incoming + outgoing).
+    """⭐ Reveals an entity's neighbourhood as a nested JSON tree:
+    root keyed by ``entity_type``, ``children`` arrays per node, multi-path
+    first-wins, keyword/retired-wiki noise filtered, ``_truncated`` marker
+    when more remain. Especially useful when you have an entity ID (from a
+    previous result) and want its graph context — often a sharper choice
+    than another `recall_memory` about the same entity. Pass `max_depth=3`
+    on hub entities (wikis with many connections) to see narrative chains.
 
     Args:
         entity_id: UUID of the root entity.
         max_depth: How far to traverse (1-3, default 2).
     """
+    if max_depth < 1 or max_depth > 3:
+        return _err("max_depth must be 1-3")
     try:
+        from braindb.services.tree import build_entity_tree
         with get_conn() as conn:
-            with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
-                cur.execute(
-                    """SELECT e.*, r.relation_type, r.relevance_score, r.description AS rel_desc,
-                       CASE WHEN r.from_entity_id = %s THEN 'out' ELSE 'in' END AS dir
-                       FROM relations r
-                       JOIN entities e ON e.id = CASE WHEN r.from_entity_id = %s THEN r.to_entity_id ELSE r.from_entity_id END
-                       WHERE r.from_entity_id = %s OR r.to_entity_id = %s""",
-                    (entity_id, entity_id, entity_id, entity_id),
-                )
-                rows = [dict(r) for r in cur.fetchall()]
-        if not rows:
-            return "No connections."
-        lines = [f"{len(rows)} connections from {entity_id}:"]
-        for r in rows:
-            lines.append(
-                f" [{r['dir']}] {r['relation_type']} (rel={r['relevance_score']})\n"
-                f" [{r['entity_type']}] {r['content'][:80]} (id: {r['id']})"
-            )
-        return _truncate("\n".join(lines))
+            tree = build_entity_tree(conn, entity_id, max_depth=max_depth)
+        if tree is None:
+            return _err(f"Entity not found: {entity_id}")
+        return _truncate(json.dumps(tree, indent=2, default=str, ensure_ascii=False))
     except Exception as e:
         return _err(str(e))
 
@@ -668,7 +664,10 @@ async def view_tree(entity_id: str, max_depth: int = 2) -> str:
 @function_tool
 @_verbose("search_sql")
 async def search_sql(query: str) -> str:
-    """Run a read-only SQL query (SELECT/WITH only). Use for complex exploration.
+    """⚠ Aggregates ONLY (counts, GROUP BY, joins for stats). NEVER for recall /
+    discovery / understanding — that's recall_memory. NEVER for "what's around
+    this entity" — that's view_tree. If you're using SQL to find or understand
+    something, stop and pick the right tool.
 
     Args:
         query: SQL query — must start with SELECT or WITH.
diff --git a/braindb/ingest_watcher.py b/braindb/ingest_watcher.py
index 4a159e1..cb3fea2 100644
--- a/braindb/ingest_watcher.py
+++ b/braindb/ingest_watcher.py
@@ -46,7 +46,7 @@
 CHUNK_OVERLAP = 75 # words of overlap between adjacent chunks — catches facts that span a boundary
 
 INGEST_TIMEOUT = 60
-AGENT_TIMEOUT = 600 # NIM free tier is slow/flaky on gemma-4-31b; generous timeout gives retries room to succeed
+AGENT_TIMEOUT = int(os.getenv("AGENT_TIMEOUT", "1200")) # env-overridable; matches wiki_scheduler pattern. Generous default for slow / deep-exploring LLM runs.
 UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
 
 logging.basicConfig(
@@ -157,13 +157,15 @@ def extract_facts_from_chunk(ds_id: str, title: str, idx: int, total: int, chunk
         f"numbers, events, named decisions. Ignore filler, opinion, and generic\n"
         f"statements. Aim for quality over quantity: typically 3-8 facts per chunk.\n\n"
         f"For EACH fact:\n"
-        f' a) Call save_fact(content="", certainty=0.8,\n'
-        f' source="document", keywords=[2-4 precise tags], importance=0.6,\n'
-        f' notes="Extracted from {title} chunk {idx}/{total}"). Record the\n'
-        f" returned fact id.\n"
+        f' a) Call save_fact(content="",\n'
+        f' source="document", keywords=[2-4 precise tags],\n'
+        f' notes="Extracted from {title} chunk {idx}/{total}"). Judge\n'
+        f" certainty and importance per the tool's docstring guidance.\n"
+        f" Record the returned fact id.\n"
         f" b) Call create_relation(from_entity_id=, "
         f'to_entity_id="{ds_id}", relation_type="derived_from",\n'
-        f' relevance_score=0.9, description="Fact extracted from {title}").\n\n'
+        f' description="Fact extracted from {title}"). Judge\n'
+        f" relevance_score and importance_score per the tool's docstring.\n\n"
         f"Do NOT call get_entity. Do NOT call update_entity on the datasource.\n"
         f"Do NOT touch the datasource content — it is read-only.\n\n"
         f"When all facts in this chunk are processed, call final_answer with\n"
@@ -210,7 +212,8 @@ def central_review(ds_id: str, title: str, fact_ids: list[str]) -> None:
         f" or is_example_of. Only create relations that are genuinely meaningful.\n"
         f"2. If the set as a whole suggests a broader observation or inference that\n"
         f" none of the individual facts capture, call save_thought with that\n"
-        f" thought (certainty=0.6-0.8, source='agent-inference') and then\n"
+        f" thought (source='agent-inference'; judge certainty + importance per\n"
+        f" the tool docstring) and then\n"
         f' create_relation(thought_id, "{ds_id}", "elaborates").\n'
         f"3. Optionally run recall_memory with 1-2 queries derived from the facts\n"
         f" to find related EXISTING entities in memory. If any are clearly\n"
diff --git a/braindb/routers/memory.py b/braindb/routers/memory.py
index c680d39..8638483 100644
--- a/braindb/routers/memory.py
+++ b/braindb/routers/memory.py
@@ -79,68 +79,36 @@ def get_rules():
 
 
 @router.get("/tree/{entity_id}")
-def entity_tree(entity_id: UUID, max_depth: int = Query(default=2, ge=1, le=3)):
-    """Return an entity and its graph connections organized by depth."""
+def entity_tree(
+    entity_id: UUID,
+    max_depth: int = Query(default=2, ge=1, le=3),
+    include_keywords: bool = Query(default=False),
+    top_k: int = Query(default=40, ge=1, le=500),
+    min_path_score: float = Query(default=0.0, ge=0.0, le=1.0),
+):
+    """Return an entity and its graph neighbourhood as a nested JSON tree.
+
+    Round-2f shape: root keyed by ``entity_type``; ``children`` array of
+    typed nodes (each keyed by its own ``entity_type`` and labelled to
+    its type — wiki=title, fact/thought=content, source=filename, etc.);
+    multi-path first-wins by accumulated path score; ``tagged_with``
+    keyword edges skipped by default (root's ``keywords`` array is the
+    one-liner instead). Top-``top_k`` connections are kept; the rest are
+    summarised with a single ``_truncated`` marker.
+    """
+    from braindb.services.tree import build_entity_tree
     with get_conn() as conn:
-        # Fetch root entity
-        with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
-            cur.execute("SELECT * FROM entities WHERE id = %s", (str(entity_id),))
-            root_row = cur.fetchone()
-        if not root_row:
-            raise HTTPException(404, "Entity not found")
-        root_row = dict(root_row)
-
-        # Fetch root extension fields
-        root_ext = fetch_ext(conn, [root_row])
-        root_data = {
-            "id": root_row["id"], "entity_type": root_row["entity_type"],
-            "title": root_row.get("title"), "content": root_row["content"],
-            "keywords": root_row.get("keywords") or [],
-            "importance": root_row["importance"],
-            "notes": root_row.get("notes"),
-            "ext": root_ext.get(root_row["id"], {}),
-        }
-
-        # Find all connections via relations (both directions)
-        with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
-            cur.execute("""
-                SELECT e.*, r.relation_type, r.relevance_score, r.description AS rel_description,
-                       CASE WHEN r.from_entity_id = %s THEN 'outgoing' ELSE 'incoming' END AS direction
-                FROM relations r
-                JOIN entities e ON e.id = CASE
-                    WHEN r.from_entity_id = %s THEN r.to_entity_id
-                    ELSE r.from_entity_id
-                END
-                WHERE r.from_entity_id = %s OR r.to_entity_id = %s
-            """, (str(entity_id), str(entity_id), str(entity_id), str(entity_id)))
-            direct_rows = [dict(r) for r in cur.fetchall()]
-
-        conn_ext = fetch_ext(conn, direct_rows) if direct_rows else {}
-
-        connections = []
-        seen_ids = {entity_id}
-        for row in direct_rows:
-            eid = row["id"]
-            if eid in seen_ids:
-                continue
-            seen_ids.add(eid)
-            connections.append({
-                "entity": {
-                    "id": eid, "entity_type": row["entity_type"],
-                    "title": row.get("title"), "content": row["content"],
-                    "keywords": row.get("keywords") or [],
-                    "importance": row["importance"],
-                    "ext": conn_ext.get(eid, {}),
-                },
-                "depth": 1,
-                "relevance": row.get("relevance_score", 1.0),
-                "via_relation_type": row.get("relation_type"),
-                "via_description": row.get("rel_description"),
-                "direction": row.get("direction"),
-            })
-
-        connections.sort(key=lambda c: (-c["relevance"], c["depth"]))
-        return {"root": root_data, "connections": connections}
+        tree = build_entity_tree(
+            conn,
+            str(entity_id),
+            max_depth=max_depth,
+            include_keywords=include_keywords,
+            top_k=top_k,
+            min_path_score=min_path_score,
+        )
+    if tree is None:
+        raise HTTPException(404, "Entity not found")
+    return tree
 
 
 @router.get("/log")
diff --git a/braindb/services/context.py b/braindb/services/context.py
index 383ef88..f9abd0f 100644
--- a/braindb/services/context.py
+++ b/braindb/services/context.py
@@ -397,7 +397,14 @@ def assemble_context(conn, req: ContextRequest) -> ContextResponse:
     items = []
     for row in all_rows:
         eid = row["id"]
-        score = seed_scores.get(eid, 0.3)
+        # Score = the entity's own similarity if it was a seed; otherwise inherit
+        # the score of the seed it descended from (carried by `seed_origin_id`
+        # through the graph CTE). This propagates the real similarity signal
+        # through depth-1+ hops instead of resetting to a literal fallback.
+        score = seed_scores.get(eid)
+        if score is None:
+            origin = row.get("seed_origin_id")
+            score = seed_scores.get(str(origin), 1.0) if origin else 1.0
         depth = row.get("min_depth", 0)
         relevance = row.get("relevance", 1.0)
         items.append(_to_item(row, score, depth, relevance, ext_map.get(eid, {})))
diff --git a/braindb/services/graph.py b/braindb/services/graph.py
index ecc4346..bcd1eaa 100644
--- a/braindb/services/graph.py
+++ b/braindb/services/graph.py
@@ -16,7 +16,11 @@
         NULL::UUID AS via_relation_id,
         NULL::TEXT AS via_relation_type,
         NULL::TEXT AS via_description,
-        NULL::TEXT AS via_notes
+        NULL::TEXT AS via_notes,
+        -- The seed each row descended from. For seeds, it's themselves;
+        -- for graph-discovered rows, it propagates through the recursion
+        -- so context.py can inherit the seed's similarity score.
+        e.id AS seed_origin_id
     FROM entities e
     WHERE e.id = ANY(%s::uuid[])
 
@@ -30,22 +34,21 @@
         (
             t.accumulated_relevance
             * r.relevance_score
+            * COALESCE(r.importance_score, 0.5)
             * CASE t.depth + 1
                 WHEN 1 THEN 1.0
-                WHEN 2 THEN 0.6
-                ELSE 0.3
+                WHEN 2 THEN 0.8
+                ELSE 0.6
             END
         )::FLOAT,
         t.visited || target.id,
         r.id,
         r.relation_type,
         r.description,
-        r.notes
+        r.notes,
+        t.seed_origin_id
     FROM traversal t
-    JOIN relations r ON (
-        r.from_entity_id = t.id
-        OR (r.is_bidirectional AND r.to_entity_id = t.id)
-    )
+    JOIN relations r ON r.from_entity_id = t.id OR r.to_entity_id = t.id
     JOIN entities target ON target.id = CASE
         WHEN r.from_entity_id = t.id THEN r.to_entity_id
@@ -56,7 +59,8 @@
     AND (
         t.accumulated_relevance
         * r.relevance_score
-        * CASE t.depth + 1 WHEN 1 THEN 1.0 WHEN 2 THEN 0.6 ELSE 0.3 END
+        * COALESCE(r.importance_score, 0.5)
+        * CASE t.depth + 1 WHEN 1 THEN 1.0 WHEN 2 THEN 0.8 ELSE 0.6 END
     ) > %s
 )
 SELECT DISTINCT ON (id)
@@ -65,7 +69,8 @@
     created_at, updated_at, accessed_at, access_count, metadata,
     depth AS min_depth,
     accumulated_relevance AS relevance,
-    via_relation_id, via_relation_type, via_description, via_notes
+    via_relation_id, via_relation_type, via_description, via_notes,
+    seed_origin_id
 FROM traversal
 ORDER BY id, depth, accumulated_relevance DESC
 """
diff --git a/braindb/services/tree.py b/braindb/services/tree.py
new file mode 100644
index 0000000..01c9216
--- /dev/null
+++ b/braindb/services/tree.py
@@ -0,0 +1,175 @@
+"""Shared tree-build service.
+
+`build_entity_tree` walks the relation graph bidirectionally from a root
+entity and returns a nested JSON tree. The same function is used by the
+HTTP endpoint `/api/v1/memory/tree/` and the agent's `view_tree`
+tool — one shape, no behaviour drift.
+
+Shape (root keyed by ``entity_type``):
+
+    {
+      "": "