fix: gbrain export auto-appends .md to internal slug-form links by vinsew · Pull Request #123 · garrytan/gbrain

vinsew · 2026-04-14T19:24:22Z

Summary

GBrain stores internal cross-page references in slug form (e.g. [Alice](./alice)) because the slug is the canonical identifier in the DB. That works inside GBrain's own resolution layer.

But when those pages are exported as .md files on disk and opened in standard markdown viewers (Obsidian, VS Code preview, GitHub web view, mkdocs/jekyll/hugo renderers), the viewers look for a literal file at ./alice — which doesn't exist. The actual file is ./alice.md.

Result: every internal link in an exported brain is silently broken on disk. The user clicks [小龙] in 龙虾群.md, sees a 404, and cannot navigate the brain outside of GBrain itself. This defeats half the value of having the brain stored as portable markdown.

Reproduction

mkdir /tmp/repro && cd /tmp/repro && git init
gbrain init
echo '[Alice](./alice)' | gbrain put 龙虾群
gbrain put alice <<< '# Alice'
gbrain export --dir /tmp/out

# Before this PR:
$ cat /tmp/out/龙虾群.md | grep Alice
[Alice](./alice)              # → click 404s in any markdown viewer

# After this PR:
$ cat /tmp/out/龙虾群.md | grep Alice
[Alice](./alice.md)           # → click opens the file

Real-world hit: in a Chinese-speaking deployment, Hermes (an agent) generated 14 entity pages with internal cross-links. Every link in the exported .md files was broken, even though gbrain query worked perfectly.

Fix

Add normalizeInternalLinks(content) in src/commands/export.ts that runs over each page's serialized markdown right before writeFileSync and rewrites slug-form internal links to filename-form by appending .md.

export function normalizeInternalLinks(content: string): string {
  return content.replace(/(\[[^\]]+\])\(([^)]+)\)/g, (match, label, target) => {
    if (!target || target.startsWith('#')) return match;
    if (/^[a-z][a-z0-9+.-]*:/i.test(target)) return match; // http:, mailto:, ftp:, file:, ...
    const withoutFragment = target.split(/[?#]/)[0];
    const basename = withoutFragment.split('/').pop() || '';
    if (/\.[a-z0-9]{1,5}$/i.test(basename)) return match; // already extended
    if (!basename) return match; // trailing slash
    const fragment = target.slice(withoutFragment.length);
    return `${label}(${withoutFragment}.md${fragment})`;
  });
}

The rewrite is conservative: it leaves untouched anything that looks external or already extended.

| Case | Behavior |
| --- | --- |
| URL schemes (http:, mailto:, ftp:, ...) | skip |
| Anchors (`#section`) | skip |
| Empty targets | skip |
| Trailing slash (directory) | skip |
| Already extended (`.md`, `.png`, ...) | skip |
| Anchors / queries on slugs | preserved when appending |

[Section](./alice#bio)   →   [Section](./alice.md#bio)
[Search](./alice?q=t)    →   [Search](./alice.md?q=t)
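
As a quick check of the table and the two examples above, the sketch below repeats the helper from this description and runs it over one sample per case. The `samples` list and the printing loop are illustrative additions for this write-up, not part of the PR's test suite.

```typescript
// Helper as shown in the Fix section, repeated so this block runs on its own.
function normalizeInternalLinks(content: string): string {
  return content.replace(
    /(\[[^\]]+\])\(([^)]+)\)/g,
    (match: string, label: string, target: string) => {
      if (!target || target.startsWith('#')) return match; // pure anchor
      if (/^[a-z][a-z0-9+.-]*:/i.test(target)) return match; // URL scheme
      const withoutFragment = target.split(/[?#]/)[0];
      const basename = withoutFragment.split('/').pop() || '';
      if (/\.[a-z0-9]{1,5}$/i.test(basename)) return match; // already extended
      if (!basename) return match; // trailing slash
      const fragment = target.slice(withoutFragment.length);
      return `${label}(${withoutFragment}.md${fragment})`;
    },
  );
}

// Illustrative samples (assumed for this sketch): [input, expected output].
const samples: Array<[string, string]> = [
  ['[Alice](./alice)', '[Alice](./alice.md)'],
  ['[Alice](../people/alice)', '[Alice](../people/alice.md)'],
  ['[小龙](./小龙)', '[小龙](./小龙.md)'],
  ['[Section](./alice#bio)', '[Section](./alice.md#bio)'],
  ['[Search](./alice?q=t)', '[Search](./alice.md?q=t)'],
  ['[Site](https://example.com/page)', '[Site](https://example.com/page)'],
  ['[Top](#section)', '[Top](#section)'],
  ['[Dir](./people/)', '[Dir](./people/)'],
  ['[Pic](./photo.png)', '[Pic](./photo.png)'],
  ['[Doc](./NOTES.MD)', '[Doc](./NOTES.MD)'],
];

for (const [input, expected] of samples) {
  const actual = normalizeInternalLinks(input);
  console.log(`${input} -> ${actual} ${actual === expected ? 'ok' : 'MISMATCH'}`);
}
```

The first five samples are rewritten; the last five fall into the skip rows of the table and come back unchanged.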

The DB content stays slug-form (GBrain's internal convention is unchanged). Only the on-disk export gets the .md annotation.
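
To make the "DB stays slug-form, export gets `.md`" split concrete, here is a minimal sketch of the export step. It is not the code in src/commands/export.ts: the `Page` shape and `renderExport` are hypothetical names, and the files are collected in a `Map` instead of being written with `writeFileSync` so the example has no dependencies.

```typescript
// Hypothetical page shape for this sketch; the real GBrain types may differ.
interface Page {
  slug: string;     // canonical identifier, e.g. "people/alice"
  markdown: string; // stored content, links kept in slug form
}

// Builds the files an export would write: filename -> content.
// `normalize` stands in for normalizeInternalLinks from the Fix section.
function renderExport(
  pages: readonly Page[],
  normalize: (content: string) => string,
): Map<string, string> {
  const files = new Map<string, string>();
  for (const page of pages) {
    // Only the exported copy is rewritten; the stored page is left as-is.
    files.set(`${page.slug}.md`, normalize(page.markdown));
  }
  return files;
}

// Usage with a deliberately tiny stand-in normalizer (assumption: the real
// helper covers far more cases than this single replace).
const group: Page = { slug: '龙虾群', markdown: '[Alice](./alice)' };
const alice: Page = { slug: 'alice', markdown: '# Alice' };
const files = renderExport([group, alice], (s) => s.replace('(./alice)', '(./alice.md)'));

console.log(files.get('龙虾群.md')); // exported copy, filename-form link
console.log(group.markdown);         // stored copy, still slug-form
```

Passing the normalizer in as a parameter keeps the sketch independent of the helper's exact rules; in the PR the helper is simply called on each page's serialized markdown right before the write.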

Impact

- 2 files changed, +209 / -1 lines (1 line of helper invocation + ~40 lines of helper + 26 tests)
- Zero behavior change for external URLs, anchors, or already-extended links
- Idempotent: re-running export on an already-exported brain is a no-op (already-extended links are skipped)
- Re-import safe: gbrain sync reading back the .md-annotated files normalizes paths back to slugs via existing slugify logic, so the round trip works
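
The idempotence point can be checked directly on the helper: applying it to its own output should change nothing, because every link it rewrote now ends in `.md` and falls into the "already extended" skip. A minimal sketch, with the helper repeated so the block is self-contained and an assumed sample page body:

```typescript
// Helper as shown in the Fix section, repeated so this block runs on its own.
function normalizeInternalLinks(content: string): string {
  return content.replace(
    /(\[[^\]]+\])\(([^)]+)\)/g,
    (match: string, label: string, target: string) => {
      if (!target || target.startsWith('#')) return match;
      if (/^[a-z][a-z0-9+.-]*:/i.test(target)) return match;
      const withoutFragment = target.split(/[?#]/)[0];
      const basename = withoutFragment.split('/').pop() || '';
      if (/\.[a-z0-9]{1,5}$/i.test(basename)) return match;
      if (!basename) return match;
      const fragment = target.slice(withoutFragment.length);
      return `${label}(${withoutFragment}.md${fragment})`;
    },
  );
}

// Assumed sample page body mixing rewritable and skipped links.
const doc = [
  '# 龙虾群',
  'Members: [Alice](./alice), [小龙](../people/小龙#bio)',
  'Docs: [site](https://example.com) and [logo](./logo.png)',
].join('\n');

const once = normalizeInternalLinks(doc);
const twice = normalizeInternalLinks(once);

console.log(once);
console.log(once === twice ? 'second pass is a no-op' : 'second pass changed the text');
```

The first pass rewrites the two slug-form links; the second pass finds nothing left to rewrite.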

Test plan

- 26 new tests in test/export.test.ts covering:
  - Same-dir slug, parent-dir slug, deep nesting, CJK slugs, multiple links per line, multi-line markdown
  - All 6 external schemes (http/https/mailto/file/ftp/tel)
  - All 4 extension cases (.md / .png / .pdf / uppercase .MD)
  - Anchor preservation, query preservation
  - Empty / trailing-slash / no-link edge cases
- All 26 export tests pass
- Full bun test: 612 pass, no new regressions (the 4 pre-existing PGLiteEngine failures are unrelated and exist on master)

Context

Fifth in a series of small, focused PRs from a real Chinese-speaking deployment. Companion to:

- fix: CJK word count and delimiters in recursive chunker #114 (chunker CJK word count)
- fix: preserve CJK characters in slugify, prevent silent collision #115 (slugify CJK preservation)
- fix: preserve CJK paths in gbrain sync (core.quotepath=false) #119 (sync git diff CJK quotepath)
- feat: self-contained API keys (read from gbrain's own config, not just env) #121 (self-contained API keys)

Same theme: GBrain is meaningfully more useful when the markdown export is a first-class deliverable, not a half-broken side effect. Combined with #121 (self-contained keys), users running gbrain export from cron or an agent subprocess now get a fully functional, portable markdown brain on disk, true to the "markdown is the source of truth" architecture this project advocates.


vinsew wants to merge 1 commit into garrytan:master from vinsew:fix/export-internal-link-md
