Skip to content

refactor: add reusable extensions core plumbing#1

Open
JSRRosenbaum wants to merge 25 commits into
mainfrom
feat/extensions-core
Open

refactor: add reusable extensions core plumbing#1
JSRRosenbaum wants to merge 25 commits into
mainfrom
feat/extensions-core

Conversation

@JSRRosenbaum

Copy link
Copy Markdown
Owner

Summary

  • add the source-agnostic plumbing needed for extension branches to layer HN / GitHub Stars / Reddit / Web on top of current FT
  • keep this branch limited to reusable core pieces only

What changed

  • add reusable Firefox profile and cookie discovery improvements
  • add reusable status rendering helpers
  • add reusable namespace/path helpers
  • add classify --limit support and portable Firefox config test coverage

Key commits

  • 9e0b3f5 feat: add classify limit and portable firefox config test
  • f809793 refactor: improve shared firefox profile discovery
  • e8f60b6 refactor: add reusable core status and path helpers

Verification

  • npx tsx --test tests/config.test.ts tests/engine.test.ts tests/firefox-cookies.test.ts tests/firefox-profile-detection.test.ts tests/namespace-helpers.test.ts
  • npm run build

Notes

  • this is the base branch for the multi-source extension stack
  • no HN / stars / reddit / web feature code lives here directly

afar1 and others added 25 commits April 10, 2026 09:01
Thanks @JSRRosenbaum for PR afar1#64 and @Bortlesboat for PR afar1#61, which both identified the Firefox Windows sync gap and helped shape this rewrite.
[codex] Harden Firefox session sync on Windows
…es-and-media

[codex] Fix GraphQL bookmark dates and fetch-media post assets
afar1#67)

Thank you @kyledenis for the fix! This was exactly the minimal change needed — `getLocalVersion()` was already in place and being used elsewhere in the file, so swapping the hardcoded string for the helper was the right call.

Closes afar1#56.
- Add Gap 3 to ft sync --gaps: fetches article content for link-only bookmarks
- New bookmark-enrich module with HTML extraction, SSRF protection, 2MB size limit
- Schema v5: article_title, article_text, article_site, enriched_at columns
- FTS5 rebuilt with article_text (BM25 weight 3.0) for searchable article content
- Security: file permissions 0o600, FTS5 query sanitizer, parameterized schema version
- Add ft sync --folders/--folder and ft list --folder with mirror semantics (read-only)
- Fix SSRF via redirect chains, add fsync durability, heal latent schema migration bug
- Expand FTS sanitizer to handle parens/+/-/column filters; 210 tests pass
- Release bookmark folders + security hardening (landed in afar1#81)
- Adds WHATS_NEW entry so the post-update notice lists new features
- Document ft sync --folders/--folder and ft list --folder and ft folders
- Fix stale --full flag reference (renamed to --rebuild in v1.3.0)
- Update browser support table to match actual registry (add Chromium/Helium/Comet, drop Arc)
…fy-domains (afar1#84)

Adds a per-invocation LLM engine override to the three classification
entry points. Takes the form `--engine <name>` (e.g. `claude`, `codex`)
and threads the choice into resolveEngine() as options.override.

Override semantics: bypasses saved preferences and interactive prompting,
fails fast if the named engine is unknown or its binary isn't on PATH.
For `ft sync --classify --engine <name>`, the check runs before sync
starts so a missing engine doesn't cost a full network sync.

Auto-resolution (no --engine flag) is unchanged.

Picks up the spirit of @ASRagab's PR afar1#16, which was written before
main's src/engine.ts landed and ended up architecturally stale. This
version plugs into the existing engine abstraction — ~95 lines instead
of 236, and inherits cross-platform detection, preference persistence,
and the async invocation path.

Co-authored-by: Asim Ragab <ASRagab@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Plain bracket lookup walked the prototype chain, so
`--engine __proto__` (or constructor/toString) hit a confusing
"Engine not on PATH" error instead of a clean "Unknown engine".
Not exploitable — the static KNOWN_ENGINES registry still
prevented any arbitrary binary from being invoked, and
hasCommandOnPath(undefined) returned false — but the allowlist
should reject prototype keys up front.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ft wiki used a single-line spinner that overwrote each progress
message, so a multi-minute LLM call looked identical to a hang
and log.md was only written at the very end. Users reported
2+ hour silent spins with no logs to inspect.

- Replace the spinner with scrolling stderr output, one line
  per event, so every page start/completion is visible.
- Stream per-page events to log.md as they happen via a local
  logLine helper, so `tail -f ~/.ft-bookmarks/md/log.md` from
  another shell now works mid-compile.
- Print an upfront "Follow live: tail -f ..." hint pointing at
  the log path.
- Preserve the Ctrl-C "Your data is safe" message via an inline
  SIGINT handler.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(wiki): strip LLM code fences from generated pages

LLM engines (Claude CLI, etc.) sometimes wrap their entire markdown
response in a ```markdown ... ``` code block. When that lands on disk
it renders the whole wiki page as a code block in editors like
Obsidian instead of rendering as markdown. Of 167 pages in the
author's wiki, 139 had this corruption.

- New pure helper `stripLlmMarkdownFence` that handles three observed
  shapes: full wrap, partial strip (language tag left over), orphan
  trailing fence. Idempotent and CRLF-safe.
- Prevention: call the helper inside `compileMd` right after every
  `invokeEngineAsync`, so new pages are born clean.
- Opportunistic cleanup: after a successful `ft sync`, quietly walk
  the wiki subdirs and fix any files that still have the bug. Silent
  when clean, one-line summary when it repaired something. Skips if
  `.lock` exists to avoid racing an in-progress compile.
- Explicit cleanup: `ft wiki --clean` runs a standalone rewrite with
  a timestamped backup dir at `md/.fence-backup-<ts>/` before touching
  anything, so the operation is reversible.
- 14 unit tests covering all three cases, CRLF, idempotency, and the
  guard that prevents stripping legitimate content beginning with the
  word "markdown".

Ran end-to-end on the author's wiki: 139/167 pages cleaned in one
pass, subsequent runs are a no-op, backup preserved.

Naming note: user originally asked for `ft md --clean`, but `ft md`
is the bookmark-export command. The corrupted files come from
`ft wiki`, so the flag lives there. Rename trivially if preferred.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(wiki): add integration tests for cleanWikiFences walker

Pure helper was covered; the walker / backup / lock-skip paths were not.
Six integration tests that set FT_DATA_DIR to a tmp wiki and exercise:

- fixes broken files across categories/domains/entities subdirs
- idempotent — second run is a no-op
- backup option preserves originals in timestamped dir
- backup dir is null when nothing needed fixing
- skips entirely when .lock file is present (compile safety)
- returns empty result when wiki dir does not exist

Also fixes a misleading comment on Case C in stripLlmMarkdownFence:
it's not a "somehow just the closing fence survived" edge case, it's
the second half of Case B's partial-strip cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: harden codex engine args and dedupe profile media

* fix: retry failed profile image downloads
…er prompts (afar1#90)

* fix(wiki): --engine override, consecutive-failure breaker, friendlier prompts

Addresses a user report of `ft wiki` against ~4000 bookmarks hanging with
no progress and exiting "Cancelled before selecting a model" after 24h —
a combination of (a) no way to skip the interactive engine prompt on the
wiki command, (b) no guard against long cascades of identical LLM
failures (auth expiry, rate limits), and (c) a raw error message with no
actionable fix.

Changes:

- **`ft wiki --engine <name>`** — mirrors PR afar1#84's pattern via the existing
  `engineOption()` helper; threads through `CompileOptions.engineOverride`
  to `resolveEngine({ override })`. Lets users bypass interactive selection
  entirely.

- **Consecutive-failure breaker** — aborts the compile after 5 consecutive
  page failures, logs the first error verbatim, and suggests checking
  engine auth. Catches the "auth expired mid-run, watch 4000 identical
  failures scroll" failure mode. Counter resets on every successful page.

- **Friendlier `PromptCancelledError` messages** — both interrupt and
  close paths now point at `ft model <engine>` and `--engine`.

- **Echo engine name early** — `progress(`Using ${engine.name}`)`
  immediately after `resolveEngine()` returns, so the selection is
  visible before the scan phase instead of only in the later "Generating
  N pages" banner.

- **Lock-file error wording** — says `ft wiki` not `ft md`.

- **Help text** — `.description()` now notes the claude/codex prerequisite.

Tests:

- `MAX_CONSECUTIVE_FAILURES` exported + sanity-checked in
  `tests/md.test.ts` to prevent accidental regression to 0.
- `tests/cli.test.ts` asserts `ft wiki` registers `--engine` and the
  description mentions both engines.
- Full suite: 241 passing (was 238).

Deferred as follow-ups: `--dry-run` (plan-only mode), SIGHUP handler for
`nohup`-style background runs, `--only`/`--max` partial-run flags.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(wiki): distinguish aborted compile from completed one in summary

Security-review follow-up to the previous commit. When the consecutive-
failure breaker fires, `compileMd` used to return normally, so the cli
printed `Done (Xs)` followed by the generic `N page(s) failed — re-run
ft wiki to retry them.` advice. Both contradicted the breaker's own
in-run guidance to "check engine auth".

- Add `aborted: boolean` to `CompileResult`; set it to `true` in the
  breaker path before `break`.
- `cli.ts` wiki handler now branches on `result.aborted`: prints
  `Aborted (Xs)`, suppresses the generic retry advice, echoes the
  breaker's engine-auth hint, and sets `process.exitCode = 1` so shell
  scripts can detect the failure.
- Final `log.md` entry prefixes `aborted ` when applicable.

No test changes — behavior is surfaced via runtime flag wiring; existing
suite covers the type and option parsing. All 241 tests still pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ices (afar1#92)

* fix(cli): fire update-check from showDashboard so no-args ft sees notices

The no-args `ft` path in `bin/ft.mjs` bypasses Commander entirely and
calls `showDashboard()` directly. The update-check is wired as a
Commander `postAction` hook (`cli.ts:1438`), so for the most common
entry point — running `ft` to glance at the dashboard — the update
notice **never** fires. Same for `ft --version` and `ft --help` (both
are Commander built-ins that exit before the hook).

Effect: users who primarily use the dashboard view can run an ancient
version of the CLI forever without ever being told an update exists.
There's a particularly dark case where a user on a hanging `ft wiki`
(see afar1#60, fixed in afar1#90) never reaches `postAction` at all — the users
most in need of a fix are the least able to discover it through the
CLI itself.

Fix: call `checkForUpdate()` at the end of `showDashboard()`. Same
semantics the subcommand path already tolerates:
- 5s AbortController on the fetch
- 24h cache debounce (instant path on cache-hit days, ~1/24 runs pay
  the network cost)
- Top-level try/catch that swallows every error class

So the dashboard has exactly the same brittleness envelope as any
existing subcommand. Zero new failure modes.

Test: `tests/cli.test.ts` adds `showDashboard: prints update notice
when cache is newer than local`. Writes a `99.99.99` cache into a
temp `FT_DATA_DIR`, invokes `showDashboard`, asserts the notice
appears in captured `console.log` output. Exercises the real cache-hit
code path with no mocking and no network.

Suite: 242 pass (was 241).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(cli): assert cached version number appears in update notice

Refactor-pass tightening of the regression test added in the previous
commit. Original assertion only checked for the literal "Update
available" string, which would pass even if showCachedUpdateNotice
regressed to printing an empty or wrong version. Adding a check for
"99.99.99" (the sentinel written into the test cache) catches that
class of regression.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…1#60) (afar1#94)

* fix(wiki): close child stdin so `ft wiki` stops hanging on claude -p

Root cause for issue afar1#60's "claude authentication" errors, which were not
actually auth errors at all.

## The hang

`claude -p` reads stdin when it's not a TTY and concatenates it with the
`-p` argument. Leaving stdin as an open pipe that's never written makes
older claude versions block on read() forever, and newer versions eat a
3-second "no stdin data received" penalty per call. The old async path
used `execFile(callback, ...)`, which builds its own internal stdio and
silently ignores any `stdio` option passed in — so we couldn't actually
close stdin through the execFile API even when we tried.

Brian's log in afar1#60 is the smoking gun: five pages with timeouts of 120,
174, 142, 132, 152s, total elapsed 721s. Every page ran to its *full*
timeout before dying. That's not rate-limiting (instant fail) or auth
(instant fail) — that's a hung read on an unclosed pipe.

`claude` working in the terminal is consistent with this bug, because
interactive runs have a TTY on stdin and never enter the stdin-reading
code path.

## Fix

Both invocations now use `spawn` / `spawnSync`:

- `invokeEngineAsync` uses `spawn` + `child.stdin.end()` immediately, so
  the child sees EOF on stdin and proceeds with just the prompt arg.
- `invokeEngine` (sync) uses `spawnSync` with `input: ''`, same effect.

Verified end-to-end against real `claude` 2.1.108: the "no stdin data
received in 3s" warning is gone, and calls return promptly.

## Error shape

The old path threw `Error("Command failed: claude -p --output-format text
<WHOLE PROMPT INLINED>")`. `src/md.ts` then did `.slice(0, 120)` to bound
the log entry, but the prompt consumed the entire budget, so `log.md`
showed only the command prefix and zero signal about why the child died.
Worse, the timeout classifier checked `msg.includes('ETIMEDOUT')` — a
string that never appears in Node's actual timeout-error message — so
every timeout was mislabeled `ERROR:` instead of `TIMEOUT:`, hiding the
pattern from everyone debugging the issue.

The new `EngineInvocationError` carries structured fields:

    { engine, bin, stderr, killed, code, signal, reason, message }

...where `reason` is one of `'timeout' | 'maxbuffer' | 'exit' | 'spawn'`.
`md.ts` now prefers stderr content when present, uses the reason code
for the log label, and tailors the breaker's "check engine auth" hint
based on what actually happened — e.g. a timeout cascade now reads
"ran to the full timeout on every page — usually a hung child, not
auth. Upgrade `claude` and retry" instead of a generic auth guess.

## Bonus: unclassified placeholder

`getCategoryCounts` reads `primary_category`, but `sampleByCategory`
reads the `categories` column with `LIKE '%<name>%'`. Fresh bookmarks
default to `primary_category='unclassified'` + `categories=NULL`, so
the count query surfaces "unclassified: N" (often N == total bookmarks)
while the sampler always returns zero rows. `ft wiki` was queuing an
"unclassified" page with 0 source bookmarks on every compile, then
wasting a full timeout on that LLM call too.

Excluded `'unclassified'` from `getCategoryCounts` with a comment
explaining why — it's a placeholder for unclassified rows, not a real
category. Also benefits the dashboard top-7 display, `ft categories`,
and `md-lint`'s uncovered-category check, which were all showing or
flagging the placeholder as a real category.

## Tests

Seven new regression tests (all passing):

- tests/engine.test.ts:
  * invokeEngineAsync closes stdin with EOF (fake script reads stdin
    with dd, we assert it sees EOF in <1.5s, not a 2s hang)
  * invokeEngine sync equivalent
  * non-zero exit throws EngineInvocationError with real stderr content
    and no prompt inlined in .message
  * timeout throws reason='timeout', killed=true, .message does not
    inline prompt
  * ENOENT spawn failure throws reason='spawn'

- tests/bookmarks-db.test.ts:
  * getCategoryCounts excludes 'unclassified' placeholder
  * getCategoryCounts still returns real categories alongside the
    exclusion

The one pre-existing failing test (`loadChromeSessionConfig: --browser
firefox`) is unrelated — it expects macOS paths and fails on Linux both
with and without this change.

https://claude.ai/code/session_01SvSvEYi4E6i6bbvFizaGCJ

* test(engine): regression for invokeEngineAsync maxBuffer path

Async variant implements maxBuffer by hand (no callback-level option),
and that branch was the only new invokeEngineAsync code path without a
regression test. Fake engine emits 64 KiB then sleep 30; assert we trip
the cap in <5s with reason='maxbuffer' and killed=true so the sleep
never runs to completion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(engine): address security-review findings on invokeEngineAsync

Four targeted hardening fixes against the invokeEngineAsync path
introduced in 4a1533b. None are critical in practice against today's
`claude` / `codex`, but each closes a defense-in-depth gap worth
having.

1. fail() clears the timeout timer (was: medium)

   When maxBuffer or the stdout handler tripped fail(), the timeout
   timer was left armed until close() eventually fired. Not a leak in
   practice (close always arrives) but it kept the event loop alive
   longer than needed and would become a real leak if any future path
   ever skipped close. `fail()` and `succeed()` now clearTimeout up
   front, and `child.on('error')` / `child.on('close')` clear
   unconditionally.

2. SIGKILL escalation after SIGTERM (was: low)

   A misbehaving engine that traps and ignores SIGTERM used to leak a
   running child. New `killChild()` helper sends SIGTERM, then schedules
   an unref'd SIGKILL after SIGKILL_GRACE_MS (2s). The `.unref()` is
   load-bearing: without it the escalation timer would block process
   exit on the happy path where the child dies immediately.

3. Byte-bounded stderr buffering (was: low)

   Old bound was `stderrChunks.length > 64` with a 32-chunk splice —
   chunk *count*, not bytes. A misbehaving engine could push 64 giant
   chunks and park ~64 * chunk_size MB in memory. Replaced with a
   STDERR_HARD_CAP (64 KiB) byte budget that drops the oldest chunks
   first, keeping at least one chunk so a single giant line still shows
   its tail. The final tail shown to users is still clipped to
   STDERR_TAIL_BYTES (4 KiB) by tailString.

4. Secret redaction at the engine boundary (was: low)

   Child stderr lands on EngineInvocationError.stderr and eventually in
   ~/.ft-bookmarks/md/log.md. If an engine ever echoed a secret to
   stderr, it'd persist forever. New `redactSecrets()` masks three
   high-confidence shapes before the stderr reaches the error object:

     - sk-…  (Anthropic / OpenAI / Stripe style)
     - ghp_ / gho_ / ghu_ / ghs_ / ghr_  (GitHub tokens)
     - Bearer <token>  (Authorization headers)

   Narrow by design — these patterns are ~impossible to collide with
   legitimate error text, so no false positives. Applied at both the
   async and sync invocation paths so all call sites get it for free.
   Exported so md.ts and tests can use it directly.

## Tests (+6)

- redactSecrets: 4 unit tests covering sk-, ghp-family, Bearer, and
  normal-text non-interference
- invokeEngineAsync: stderr bound + redaction E2E — spams 100 KiB of
  stderr noise, then writes an error line with a fake `sk-…` token;
  asserts the stored stderr is clipped, the secret is redacted, and
  `.message` doesn't contain the raw token either
- invokeEngineAsync: SIGKILL escalation — fake engine traps SIGTERM
  and sleeps; asserts the promise rejects near the timeout (SIGTERM
  path, not waiting for SIGKILL) and leaves room for the 2s grace
  window so the escalation has a chance to land before test teardown

All 256 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* bump v1.3.8

Fixes the `ft wiki` stdin hang (afar1#60) — `claude -p` was reading
non-TTY stdin and blocking on an unclosed pipe, running every page
to its full timeout. Also ships secret redaction and SIGKILL
escalation in the engine invocation path as defense-in-depth, and
excludes the `unclassified` placeholder from `getCategoryCounts`
so fresh compiles no longer burn a timeout on a 0-sample page.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
… (v1.3.9) (afar1#100)

* fix(gaps): capture full long-form note_tweets in sync and gap-fill

Root cause: `GRAPHQL_FEATURES` omitted `longform_notetweets_consumption_enabled`,
so the bookmarks-feed and folder-timeline responses never included the
`note_tweet` block. Every long-form tweet synced via `ft sync` was silently
stored at the 275-char `legacy.full_text` preview. `ft sync --gaps` couldn't
recover them because syndication only returns a `note_tweet.id` stub, never
the body.

Changes:
- Enable `longform_notetweets_consumption_enabled` so every `ft sync` captures
  full note_tweets on first touch (bookmarks feed + folder timelines).
- `convertTweetToRecord` quoted-tweet path now reads `note_tweet` body first,
  so bookmarks whose quoted tweet is a note_tweet also land full-length.
- New `fetchTweetByIdViaGraphQL` hits TweetResultByRestId with the consumption
  flag set; mirrors the existing cookie / retry / backoff patterns.
- `syncGaps` rewired to use GraphQL first (with syndication as fallback for
  transient failures), stamps `textExpandedAt` only when GraphQL authoritatively
  settled the question, and stamps `quotedTweetFailedAt` on permanent failures
  — killing both the "runs forever" loop and the silent no-op path.
- CLI wires `--browser` / `--cookies` / profile options through to `syncGaps`.

Tests: 10 new, covering feature-flag fix, new fetcher, gap-fill expand-and-mark,
idempotence skip, syndication-preview regression, transient-vs-permanent marking.
266/266 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* bump v1.3.9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Accidentally dropped validateCookie() during the auth_token required-check
refactor. Restore printable-ASCII validation and whitespace trimming for
both ct0 and auth_token before building the cookie header.

Addresses review feedback on PR afar1#101.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants