Merged
30 changes: 29 additions & 1 deletion BRAINDB_GUIDE.md
@@ -5,6 +5,29 @@ The API runs at **http://localhost:8000**. Everything is done via HTTP calls.

---

## ⚠ TOOL PRIORITY (read this first)

BrainDB's value is the graph + embeddings + ranking. Use that power; do not
fall back to flat SQL.

1. **`POST /api/v1/memory/context`** — default for **query-driven** recall,
discovery, understanding ("what do we know about X?"). Keyword-mediated
fuzzy + embeddings + graph + ranking.
2. **`GET /api/v1/memory/tree/<id>?max_depth=N`** — reveals an entity's
neighbourhood as a nested JSON tree (root keyed by `entity_type`,
`children` arrays per node, keyword + retired-wiki noise filtered,
`_truncated` last-child marker if more remain). Especially useful when
you have an entity ID and want its graph context. On hub entities
(wikis with many connections) pass `max_depth=3` for narrative chains.
3. **`POST /api/v1/agent/query`** ("delegate to a subagent") — for multi-step
investigation / disambiguation.
4. `GET /api/v1/entities…` and `/entities/<id>/relations` — direct lookups.
5. **`POST /api/v1/memory/sql` ⚠ exception only** — aggregates (counts,
GROUP BY, activity-log joins). NEVER for recall, discovery, similarity,
understanding, or "what's around this entity" — those are the tools above.
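The nested shape described in item 2 lends itself to a plain recursive walk. The following is a minimal Python sketch, not the project's own client code. The sample payload and the `id` field are hypothetical, and the exact form of the truncation marker (assumed here to be a final child object carrying a `_truncated` key) is an assumption; the guide only names `children` and `_truncated`.

```python
from typing import Any, Iterator


def iter_nodes(node: dict[str, Any], depth: int = 0) -> Iterator[tuple[int, dict[str, Any]]]:
    """Yield (depth, node) for every real node, skipping truncation markers."""
    yield depth, node
    for child in node.get("children", []):
        if "_truncated" in child:  # assumed marker shape, see lead-in
            continue
        yield from iter_nodes(child, depth + 1)


def is_truncated(node: dict[str, Any]) -> bool:
    """Return True if any level of the tree ends with a `_truncated` marker."""
    children = node.get("children", [])
    if children and "_truncated" in children[-1]:
        return True
    return any(is_truncated(c) for c in children if "_truncated" not in c)


# Hypothetical response body: the root object is keyed by its entity_type.
sample = {
    "wiki": {
        "id": "w1",
        "children": [
            {"id": "f1", "children": [{"id": "t1"}]},
            {"id": "f2"},
            {"_truncated": True},
        ],
    }
}

root_type, root = next(iter(sample.items()))
visited = [(depth, node["id"]) for depth, node in iter_nodes(root)]
clipped = is_truncated(root)
```

If `clipped` is true, a caller would probably re-request the tree with a larger `top_k` or a smaller `max_depth` rather than treat the result as complete.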

---

## Entity Types

| Type | What to store |
@@ -270,7 +293,12 @@ curl -X POST http://localhost:8000/api/v1/memory/context \
```bash
curl http://localhost:8000/api/v1/memory/tree/<UUID>?max_depth=2
```
-Returns the entity and all its graph connections organized by depth and relevance.
+Returns the entity and its 1-N hop neighbourhood as a nested JSON tree (root
+keyed by `entity_type`, `children` arrays per node, multi-path first-wins by
+best accumulated path score, keyword + retired-wiki noise filtered by default).
+Optional query params: `include_keywords` (default `false`), `top_k` (default
+`40`), `min_path_score` (default `0.0`). A `_truncated` last-child marker
+appears when `top_k` clips the result.

### Get All Rules
```bash
73 changes: 73 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,79 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.0] — 2026-06-03

Headline: a focused pass on recall quality and the `view_tree` tool. The
per-edge LLM judgment that was missing on `create_relation` is now wired
through to graph scoring, and `view_tree` returns a nested JSON tree the
agent can actually navigate (vs the depth-grouped text that silently
clipped 70% of connections on popular wikis).

### Changed

- **`view_tree` / `GET /api/v1/memory/tree/<id>` — nested JSON shape.**
Root keyed by `entity_type`, `children` arrays per node, multi-path
first-wins by best accumulated path score, keyword + retired-wiki noise
filtered by default, `_truncated` last-child marker when more remain.
One shared builder (`build_entity_tree` in `braindb/services/tree.py`)
for the HTTP endpoint and the agent tool — no behaviour drift. New
optional query params: `include_keywords` (default `false`), `top_k`
(default `40`), `min_path_score` (default `0.0`). Bench: Path B (Qwen
27B) 5/5 PASS, **−25% wall-clock**, **−26% tool calls**, **zero
`delegate` calls** on the hardest question (was 2). Path A (Claude
Code + curl skill) 5/5 PASS, `view_tree` usage 0 → 1-2 calls — the
structured shape is now usable in practice. Numbers in
`benchmarks/runs/round-2f_comparison.md`.
- **`create_relation` writes both edge scores.** The `importance_score`
column had been NULL for every agent-created row since day one; the
parameter is now on the tool, the watcher's extraction prompt no
longer dictates literal score values (the LLM judges per docstring),
and the graph CTE multiplies `relevance_score × COALESCE(
importance_score, 0.5) × depth_penalty` per hop. The
`is_bidirectional` field on `relations` is now ignored by graph
traversal — every edge walks both ways (matching what
`services/tree.py` already did). The field stays in the schema for
backwards compatibility.
- **Seed-similarity propagation in graph scoring.** Recall hops carry
the seed's similarity score forward through the graph; the depth
multiplier is softened so default-quality intermediates don't collapse
depth-2/3 results.
- **Prompts**: `view_tree` reframed as a capability for "explore around
this entity"; `search_sql` demoted to an aggregates-only exception.
Same wording applied to `system_prompt.md`, both skill files
(`skills/braindb/SKILL.md`, `skills/braindb-agent/SKILL.md`),
`README.md`, and `BRAINDB_GUIDE.md`.
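The per-hop formula in the `create_relation` entry can be worked through by hand. The Python sketch below assumes that a path's accumulated score is the seed similarity multiplied by each hop's term, and that `depth_penalty` is a constant 0.8 per hop. Neither the real penalty values nor the exact way the seed score is combined are stated in this changelog, so both are illustrative only.

```python
from typing import Optional


def hop_score(relevance: float, importance: Optional[float], depth_penalty: float) -> float:
    """relevance_score * COALESCE(importance_score, 0.5) * depth_penalty."""
    effective_importance = 0.5 if importance is None else importance
    return relevance * effective_importance * depth_penalty


def path_score(
    seed: float,
    hops: list[tuple[float, Optional[float]]],
    depth_penalty: float = 0.8,  # assumed value, not taken from the source
) -> float:
    """Multiply the seed similarity by every hop's score along one path."""
    score = seed
    for relevance, importance in hops:
        score *= hop_score(relevance, importance, depth_penalty)
    return score


# An older edge with a NULL importance_score falls back to 0.5.
legacy = path_score(1.0, [(0.9, None)])
# The same edge after the agent has judged it as important.
judged = path_score(1.0, [(0.9, 0.9)])
```

Under these assumptions the legacy edge scores about 0.36 and the judged edge about 0.648, which shows why rows left at NULL stay neutral rather than dropping out of the ranking.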

### Fixed

- **`view_tree` keyword noise**: keyword entities were leaking through
non-`tagged_with` edges (e.g. `similar_to` between keywords). The
filter now applies to target `entity_type='keyword'` as well as the
edge type. Surfaced by the new nested JSON shape and fixed in the
same commit.
- **`view_tree` duplicate retired-wiki siblings**: when the wiki
maintainer creates a wiki and later consolidates duplicates, the
retired ones used to still appear in `view_tree` output as duplicate
siblings of the canonical wiki. The tree CTE now LEFT JOINs
`wikis_ext` and skips rows where `retired_at IS NOT NULL`.
- **Test isolation in `tests/test_ingest.py`**: the three ingest tests
used fixed content strings, so a previous run's row in the DB caused
dedup-by-hash to fire and the 201 assertion to fail on subsequent
runs. Each test now prepends a per-run `uuid.uuid4().hex` to its
content.
- **Missing test dep**: added `pytest-asyncio==0.23.7` to
`[project.optional-dependencies].dev`. Existing
`@pytest.mark.asyncio` tests were silently failing on clean installs.
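The test-isolation fix can be shown on its own. This is a minimal Python sketch, assuming the dedup key is a SHA-256 digest of the content (the entry above only says "dedup-by-hash"); `unique_content` and `content_hash` are hypothetical helpers, not names from the test suite.

```python
import hashlib
import uuid


def unique_content(base: str) -> str:
    """Prefix fixed test content with a per-run token so its hash is fresh."""
    return f"{uuid.uuid4().hex} {base}"


def content_hash(content: str) -> str:
    # Assumption: stand-in for whatever hash the ingest endpoint really uses.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


first_run = unique_content("Ingest test document")
second_run = unique_content("Ingest test document")
hashes_differ = content_hash(first_run) != content_hash(second_run)
```

Because every run produces a different prefix, a row left behind by an earlier run no longer collides with the new content, so the create path is exercised each time.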

### Upgrading from v0.3.0

No DB migration. No env-var changes. The wiki maintainer's existing
retired-wiki pipeline now also gates `view_tree` traversal — old wikis
with `retired_at IS NOT NULL` are silently skipped in tree output (they
remain readable via `GET /api/v1/entities/<id>`). `pyproject.toml`
version field was at `0.2.0` in v0.3.0's tagged release (the bump was
missed); this release catches it up to `0.4.0` in one step.

## [0.3.0] — 2026-05-25

Headline: a small read-only frontend lands so humans can browse the same
2 changes: 1 addition & 1 deletion README.md
@@ -134,7 +134,7 @@ Drop a markdown file into `data/sources/` and the watcher sidecar picks it up wi
| POST | `/api/v1/entities/datasources/ingest` | Read a file from disk and create a datasource entity |
| POST | `/api/v1/memory/search` | Fast fuzzy search |
| POST | `/api/v1/memory/context` | Full retrieval: fuzzy → graph → decay → rank |
-| GET | `/api/v1/memory/tree/{id}` | Entity graph tree — connections by depth |
+| GET | `/api/v1/memory/tree/{id}` | Entity neighbourhood as a nested JSON tree |
| GET | `/api/v1/memory/log` | Activity log — when and how things happened |
| POST | `/api/v1/memory/sql` | Read-only SQL queries (SELECT/WITH only) |
| POST | `/api/v1/memory/generate-embeddings` | Batch-generate keyword embeddings |
34 changes: 20 additions & 14 deletions braindb/agent/prompts/system_prompt.md
@@ -26,7 +26,7 @@ CRITICAL — every assistant message MUST be a tool call; never plain prose. The
- `delete_entity(entity_id)`

**Relations:**
-- `create_relation(from_entity_id, to_entity_id, relation_type, relevance_score, description)`
+- `create_relation(from_entity_id, to_entity_id, relation_type, relevance_score, importance_score, description)`
- `view_entity_relations(entity_id)`
- `delete_relation(relation_id)`

@@ -50,24 +50,30 @@ CRITICAL — every assistant message MUST be a tool call; never plain prose. The
BrainDB's value is the graph + embeddings + ranking. Use that power; do not
fall back to flat SQL.

-1. **`recall_memory`** — the default for ALL recall, discovery, and
-understanding: multi-query fuzzy + full-text + **keyword-embedding** +
-graph traversal + decay + ranking. This is almost always the right first
-call.
-2. **`delegate_to_subagent`** — for any multi-step investigation or
+1. **`recall_memory`** — default for ALL **query-driven** recall, discovery,
+and understanding: multi-query fuzzy + full-text + **keyword-embedding** +
+graph traversal + decay + ranking. Use first when you don't yet know which
+specific entity you want — "what do we know about X."
+2. **`view_tree(<id>, max_depth=N)`** — the efficient "explore around this
+entity" tool. Returns a nested JSON tree: root keyed by entity_type,
+`children` arrays per node, 1-N hops out, keyword/retired-wiki noise
+filtered, `_truncated` marker if more remain. When you have an entity
+ID and want what's connected, this is usually a better next step than
+another `recall_memory`. On hub entities (wikis), pass `max_depth=3`
+to see narrative chains.
+3. **`delegate_to_subagent`** — for any multi-step investigation or
disambiguation ("is this the same person/thing?", "find and resolve X").
A fresh agent with the full toolset; returns a summary. Prefer this over
doing a long crawl yourself.
-3. `view_tree` / `view_entity_relations` / `get_entity` / `list_entities` —
-targeted structure lookups.
-4. **`search_sql` — exception only.** A blunt SELECT has no embeddings, no
-graph, no ranking — it throws away everything BrainDB is good at. Use it
-*only* for a specific structured/aggregate question the tools above cannot
-express (counts, GROUP BY, activity-log joins). Never for recall,
-discovery, similarity, or understanding.
+4. `view_entity_relations` / `get_entity` / `list_entities` — direct lookups
+(single-hop relations, full body of one entity, listing by filter).
+5. **`search_sql` ⚠ exception only — aggregates only** (counts, GROUP BY,
+activity-log joins, schema inspection). A blunt SELECT has no embeddings,
+no graph, no ranking — it throws away everything BrainDB is good at.
+Never for recall, discovery, similarity, or understanding.

If you reach for `search_sql` to "find" or "understand" something, stop —
-that's a `recall_memory` or `delegate_to_subagent` job.
+that's a `recall_memory` or `view_tree` or `delegate_to_subagent` job.

## READING CONTENT — previews vs the full body

55 changes: 27 additions & 28 deletions braindb/agent/tools.py
@@ -112,10 +112,10 @@ async def recall_memory(
queries: list[str],
max_results: int = settings.recall_default_max_results,
) -> str:
-"""Search BrainDB memory with multiple natural language queries.
+"""⭐ Primary recall tool — use FIRST for ANY "what do we know about X" question.
+
Runs fuzzy + fulltext + keyword embedding search, merges with geometric mean,
traverses the graph up to 3 hops, applies temporal decay.
-Use this as the primary recall tool.

QUERY STRATEGY — IMPORTANT for high-recall on narrow subjects:

@@ -542,7 +542,8 @@ async def create_relation(
from_entity_id: str,
to_entity_id: str,
relation_type: str,
-relevance_score: float = 0.7,
+relevance_score: float = 0.5,
+importance_score: float = 0.5,
description: Optional[str] = None,
) -> str:
"""Create a relation between two entities.
@@ -551,17 +552,18 @@ async def create_relation(
from_entity_id: Source entity UUID.
to_entity_id: Target entity UUID.
relation_type: One of: supports, contradicts, elaborates, refers_to, derived_from, similar_to, is_example_of, challenges, tagged_with.
-relevance_score: 0-1 (default 0.7).
+relevance_score: 0-1 — how tight the semantic link is (0.9 = strong, 0.5 = neutral / "didn't judge", 0.2 = weak).
+importance_score: 0-1 — how much losing this edge would degrade recall (0.9 = critical, 0.5 = neutral / "didn't judge", 0.2 = trivial).
description: Why this relation exists.
"""
try:
with get_conn() as conn:
with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
try:
cur.execute(
-"""INSERT INTO relations (from_entity_id, to_entity_id, relation_type, relevance_score, description)
-VALUES (%s, %s, %s, %s, %s) RETURNING id""",
-(from_entity_id, to_entity_id, relation_type, relevance_score, description),
+"""INSERT INTO relations (from_entity_id, to_entity_id, relation_type, relevance_score, importance_score, description)
+VALUES (%s, %s, %s, %s, %s, %s) RETURNING id""",
+(from_entity_id, to_entity_id, relation_type, relevance_score, importance_score, description),
)
rid = cur.fetchone()["id"]
except psycopg2.errors.UniqueViolation:
@@ -634,41 +636,38 @@ async def delete_relation(relation_id: str) -> str:
@function_tool
@_verbose("view_tree")
async def view_tree(entity_id: str, max_depth: int = 2) -> str:
-"""View the graph of connections around an entity (incoming + outgoing).
+"""⭐ Reveals an entity's neighbourhood as a nested JSON tree:
+root keyed by ``entity_type``, ``children`` arrays per node, multi-path
+first-wins, keyword/retired-wiki noise filtered, ``_truncated`` marker
+when more remain. Especially useful when you have an entity ID (from a
+previous result) and want its graph context — often a sharper choice
+than another `recall_memory` about the same entity. Pass `max_depth=3`
+on hub entities (wikis with many connections) to see narrative chains.

Args:
entity_id: UUID of the root entity.
max_depth: How far to traverse (1-3, default 2).
"""
+if max_depth < 1 or max_depth > 3:
+return _err("max_depth must be 1-3")
try:
+from braindb.services.tree import build_entity_tree
with get_conn() as conn:
-with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
-cur.execute(
-"""SELECT e.*, r.relation_type, r.relevance_score, r.description AS rel_desc,
-CASE WHEN r.from_entity_id = %s THEN 'out' ELSE 'in' END AS dir
-FROM relations r
-JOIN entities e ON e.id = CASE WHEN r.from_entity_id = %s THEN r.to_entity_id ELSE r.from_entity_id END
-WHERE r.from_entity_id = %s OR r.to_entity_id = %s""",
-(entity_id, entity_id, entity_id, entity_id),
-)
-rows = [dict(r) for r in cur.fetchall()]
-if not rows:
-return "No connections."
-lines = [f"{len(rows)} connections from {entity_id}:"]
-for r in rows:
-lines.append(
-f" [{r['dir']}] {r['relation_type']} (rel={r['relevance_score']})\n"
-f" [{r['entity_type']}] {r['content'][:80]} (id: {r['id']})"
-)
-return _truncate("\n".join(lines))
+tree = build_entity_tree(conn, entity_id, max_depth=max_depth)
+if tree is None:
+return _err(f"Entity not found: {entity_id}")
+return _truncate(json.dumps(tree, indent=2, default=str, ensure_ascii=False))
except Exception as e:
return _err(str(e))


@function_tool
@_verbose("search_sql")
async def search_sql(query: str) -> str:
-"""Run a read-only SQL query (SELECT/WITH only). Use for complex exploration.
+"""⚠ Aggregates ONLY (counts, GROUP BY, joins for stats). NEVER for recall /
+discovery / understanding — that's recall_memory. NEVER for "what's around
+this entity" — that's view_tree. If you're using SQL to find or understand
+something, stop and pick the right tool.

Args:
query: SQL query — must start with SELECT or WITH.
17 changes: 10 additions & 7 deletions braindb/ingest_watcher.py
@@ -46,7 +46,7 @@
CHUNK_OVERLAP = 75 # words of overlap between adjacent chunks — catches facts that span a boundary

INGEST_TIMEOUT = 60
-AGENT_TIMEOUT = 600 # NIM free tier is slow/flaky on gemma-4-31b; generous timeout gives retries room to succeed
+AGENT_TIMEOUT = int(os.getenv("AGENT_TIMEOUT", "1200")) # env-overridable; matches wiki_scheduler pattern. Generous default for slow / deep-exploring LLM runs.
UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")

logging.basicConfig(
@@ -157,13 +157,15 @@ def extract_facts_from_chunk(ds_id: str, title: str, idx: int, total: int, chunk
f"numbers, events, named decisions. Ignore filler, opinion, and generic\n"
f"statements. Aim for quality over quantity: typically 3-8 facts per chunk.\n\n"
f"For EACH fact:\n"
-f' a) Call save_fact(content="<the fact in one sentence>", certainty=0.8,\n'
-f' source="document", keywords=[2-4 precise tags], importance=0.6,\n'
-f' notes="Extracted from {title} chunk {idx}/{total}"). Record the\n'
-f" returned fact id.\n"
+f' a) Call save_fact(content="<the fact in one sentence>",\n'
+f' source="document", keywords=[2-4 precise tags],\n'
+f' notes="Extracted from {title} chunk {idx}/{total}"). Judge\n'
+f" certainty and importance per the tool's docstring guidance.\n"
+f" Record the returned fact id.\n"
f" b) Call create_relation(from_entity_id=<fact_id>, "
f'to_entity_id="{ds_id}", relation_type="derived_from",\n'
-f' relevance_score=0.9, description="Fact extracted from {title}").\n\n'
+f' description="Fact extracted from {title}"). Judge\n'
+f" relevance_score and importance_score per the tool's docstring.\n\n"
f"Do NOT call get_entity. Do NOT call update_entity on the datasource.\n"
f"Do NOT touch the datasource content — it is read-only.\n\n"
f"When all facts in this chunk are processed, call final_answer with\n"
@@ -210,7 +212,8 @@ def central_review(ds_id: str, title: str, fact_ids: list[str]) -> None:
f" or is_example_of. Only create relations that are genuinely meaningful.\n"
f"2. If the set as a whole suggests a broader observation or inference that\n"
f" none of the individual facts capture, call save_thought with that\n"
-f" thought (certainty=0.6-0.8, source='agent-inference') and then\n"
+f" thought (source='agent-inference'; judge certainty + importance per\n"
+f" the tool docstring) and then\n"
f' create_relation(thought_id, "{ds_id}", "elaborates").\n'
f"3. Optionally run recall_memory with 1-2 queries derived from the facts\n"
f" to find related EXISTING entities in memory. If any are clearly\n"