
feat(semble): upgrade to v0.4.1, flatten result parsing, localize status messages #734

Open
navedmerchant wants to merge 4 commits into main from feat/semble-v0.4.1-upgrade

@navedmerchant navedmerchant commented Jun 27, 2026


Related GitHub Issue

Closes: #733

Description

Upgrades the bundled semble code-search binary from v0.3.1 to v0.4.1 and adapts the provider to semble v0.4.0+'s flattened JSON output format. Also localizes all SembleProvider status messages and hardens the downloader against stale archives left over from prior versions.

Key implementation details:

  • Version bump & checksums: SEMBLE_VERSION v0.3.1 → v0.4.1 with refreshed SEMBLE_SHA256 for linux-x64, linux-arm64, darwin-arm64, and win32-x64.
  • Flat result parsing: semble v0.4.0+ no longer wraps each result in a chunk object. Removed the SembleChunk interface and flattened SembleSearchResult to { file_path, start_line, end_line, score, content? }. Updated SembleProvider result conversion and SembleCLI parsing/docs accordingly. content is now optional (omitted when semble is invoked with --max-snippet-lines 0).
  • Stale-archive safety: the local archive cache path is now version-prefixed (${SEMBLE_VERSION}-${archive}) so a leftover archive from a previous version can never be reused or verified against the new checksum. A partial archive is also unlinked before re-downloading. Added cleanupStaleArchives to best-effort sweep orphaned archives after a successful install.
  • i18n: moved hard-coded English status strings out of SembleProvider into embeddings.json semble.* keys (downloadingBinary, ready, unsupportedPlatform, downloadFailed, checkFailed, providerReset) across all 18 locales.
  • Tests: updated provider.spec.ts, semble-cli.spec.ts, and semble-downloader.spec.ts for the flat result shape and version-prefixed cache path.
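
As a rough illustration of the flat result shape described above, the following sketch parses semble v0.4.0+ output into `SembleSearchResult` values. The field names come from this PR; the helpers `toSearchResult` and `parseSembleOutput` are hypothetical, and the assumption that the CLI prints a top-level JSON array is mine, not confirmed by the PR.

```typescript
// Flat result shape as described in this PR (semble v0.4.0+, no `chunk` wrapper).
interface SembleSearchResult {
	file_path: string
	start_line: number
	end_line: number
	score: number
	content?: string // omitted when semble runs with --max-snippet-lines 0
}

// Hypothetical helper: validates one raw JSON entry and returns a typed result,
// or undefined when a required field is missing or has the wrong type.
function toSearchResult(raw: unknown): SembleSearchResult | undefined {
	if (typeof raw !== "object" || raw === null) return undefined
	const r = raw as Record<string, unknown>
	if (
		typeof r.file_path !== "string" ||
		typeof r.start_line !== "number" ||
		typeof r.end_line !== "number" ||
		typeof r.score !== "number"
	) {
		return undefined
	}
	return {
		file_path: r.file_path,
		start_line: r.start_line,
		end_line: r.end_line,
		score: r.score,
		// Only carry `content` through when it is actually present.
		...(typeof r.content === "string" ? { content: r.content } : {}),
	}
}

// Hypothetical parser; ASSUMES the CLI output is a JSON array of results.
function parseSembleOutput(stdout: string): SembleSearchResult[] {
	const parsed: unknown = JSON.parse(stdout)
	if (!Array.isArray(parsed)) return []
	return parsed.map(toSearchResult).filter((r): r is SembleSearchResult => r !== undefined)
}
```

Entries that still use the old `chunk` wrapper fail validation and are dropped in this sketch; the real provider may handle them differently.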

Reviewers should pay attention to: the flattened SembleSearchResult shape (breaking type change, but semble-internal), the version-prefixed archive cache path, and the new semble.* i18n keys across locales.

Test Procedure

  • Unit tests (run from the src workspace):
    • cd src && npx vitest run services/code-index/semble/__tests__/provider.spec.ts
    • cd src && npx vitest run services/code-index/semble/__tests__/semble-cli.spec.ts
    • cd src && npx vitest run services/code-index/semble/__tests__/semble-downloader.spec.ts
  • Lint and type-check already pass via the pre-commit/pre-push hooks (turbo lint, turbo check-types).
  • Manual verification (optional): trigger a fresh semble download on a clean global storage dir and confirm the version-prefixed archive is created and stale archives from the prior version are removed.

Pre-Submission Checklist

  • Issue Linked: This PR is linked to approved issue [ENHANCEMENT] Upgrade semble to v0.4.1 and localize provider status messages #733.
  • Scope: My changes are focused on the linked issue (semble upgrade + localization).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: Updated unit tests cover the new flat result shape and version-prefixed cache path.
  • Documentation Impact: I have considered documentation updates (none required — internal provider change).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

N/A — no UI changes; status strings are surfaced via existing system-state infrastructure.

Documentation Updates

  • No documentation updates are required.

Additional Notes

The SembleChunk export was removed from index.ts; confirm no external consumers depend on it (it is semble-internal).

Get in Touch

Summary by CodeRabbit

  • New Features

    • Added localized Semble status and error messages across multiple languages.
    • Updated Semble search/indexing to support the newer flattened results format.
  • Bug Fixes

    • Improved Semble initialization and state transitions using translated user-facing messages (with error details).
    • Fixed result normalization for file paths, line ranges, and snippet content when present.
    • Reduced leftover cached archive clutter during Semble version upgrades and stale cleanup.
  • Tests

    • Updated Semble CLI/provider/downloader tests to match the v0.4.0+ flattened schema and new version behavior.

…tus messages

- Bump SEMBLE_VERSION v0.3.1 -> v0.4.1 and refresh SEMBLE_SHA256 checksums
- Adapt to semble v0.4.0+ flat JSON output (no chunk wrapper); remove
  SembleChunk and flatten SembleSearchResult (file_path, start_line,
  end_line, score, content)
- Update provider.ts and semble-cli.ts parsing accordingly
- Version-prefix the local archive cache path so a stale archive from a
  previous semble version is never reused or verified against the new
  checksum; remove partial archive before re-downloading
- Add cleanupStaleArchives to best-effort sweep orphaned archives after
  a version upgrade
- Localize SembleProvider status strings via i18n embeddings:semble.*
  keys across all 18 locales
- Update provider/cli/downloader unit tests for the new shape and
  version-prefixed cache path

Closes #733

coderabbitai Bot commented Jun 27, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d4782017-0456-4eb1-8d1e-1093598d05c8

📥 Commits

Reviewing files that changed from the base of the PR and between c817271 and 34eea6a.

📒 Files selected for processing (23)
  • src/i18n/locales/ca/embeddings.json
  • src/i18n/locales/de/embeddings.json
  • src/i18n/locales/en/embeddings.json
  • src/i18n/locales/es/embeddings.json
  • src/i18n/locales/fr/embeddings.json
  • src/i18n/locales/hi/embeddings.json
  • src/i18n/locales/id/embeddings.json
  • src/i18n/locales/it/embeddings.json
  • src/i18n/locales/ja/embeddings.json
  • src/i18n/locales/ko/embeddings.json
  • src/i18n/locales/nl/embeddings.json
  • src/i18n/locales/pl/embeddings.json
  • src/i18n/locales/pt-BR/embeddings.json
  • src/i18n/locales/ru/embeddings.json
  • src/i18n/locales/tr/embeddings.json
  • src/i18n/locales/vi/embeddings.json
  • src/i18n/locales/zh-CN/embeddings.json
  • src/i18n/locales/zh-TW/embeddings.json
  • src/services/code-index/semble/__tests__/provider.spec.ts
  • src/services/code-index/semble/__tests__/semble-downloader.spec.ts
  • src/services/code-index/semble/index.ts
  • src/services/code-index/semble/provider.ts
  • src/services/code-index/semble/semble-downloader.ts
✅ Files skipped from review due to trivial changes (12)
  • src/i18n/locales/zh-TW/embeddings.json
  • src/i18n/locales/nl/embeddings.json
  • src/i18n/locales/fr/embeddings.json
  • src/i18n/locales/pl/embeddings.json
  • src/i18n/locales/en/embeddings.json
  • src/i18n/locales/ko/embeddings.json
  • src/i18n/locales/zh-CN/embeddings.json
  • src/i18n/locales/it/embeddings.json
  • src/i18n/locales/hi/embeddings.json
  • src/i18n/locales/pt-BR/embeddings.json
  • src/i18n/locales/ru/embeddings.json
  • src/i18n/locales/tr/embeddings.json
🚧 Files skipped from review as they are similar to previous changes (8)
  • src/i18n/locales/es/embeddings.json
  • src/i18n/locales/ca/embeddings.json
  • src/i18n/locales/ja/embeddings.json
  • src/i18n/locales/vi/embeddings.json
  • src/i18n/locales/de/embeddings.json
  • src/i18n/locales/id/embeddings.json
  • src/services/code-index/semble/provider.ts
  • src/services/code-index/semble/__tests__/provider.spec.ts

📝 Walkthrough

Walkthrough

Semble is upgraded to v0.4.1. Its search result schema is flattened across types, CLI parsing, provider conversion, and tests; provider status messages move to locale-backed i18n strings; and the downloader now uses version-prefixed archives with stale-cache cleanup.

Changes

Semble v0.4.1 upgrade

| Layer / File(s) | Summary |
|---|---|
| Flattened result schema: src/services/code-index/semble/types.ts, src/services/code-index/semble/index.ts, src/services/code-index/semble/semble-cli.ts, src/services/code-index/semble/provider.ts, src/services/code-index/semble/__tests__/semble-cli.spec.ts, src/services/code-index/semble/__tests__/provider.spec.ts | Public result types and parsing now use flat file_path/start_line/end_line/content fields instead of chunk wrappers, and the related provider and CLI tests were updated for the new shape. |
| Translated provider status messages: src/services/code-index/semble/provider.ts, src/i18n/locales/*/embeddings.json, src/services/code-index/semble/__tests__/provider.spec.ts | SembleProvider now sources initialization and state text from i18n, and the locale embedding files add a new semble namespace with localized status and error strings. |
| Versioned archive cache: src/services/code-index/semble/semble-downloader.ts, src/services/code-index/semble/__tests__/semble-downloader.spec.ts | The downloader bumps Semble to v0.4.1, uses version-prefixed cache files, removes stale archives before and after install, and updates the matching downloader tests and fixtures. |

Sequence Diagram(s)

sequenceDiagram
  participant downloadSemble
  participant fs.promises
  participant https
  participant cleanupStaleArchives

  downloadSemble->>fs.promises: delete stale cached archive and inspect storageDir
  downloadSemble->>https: fetch v0.4.1 archive
  downloadSemble->>fs.promises: extract binary and write .semble-version
  downloadSemble->>cleanupStaleArchives: remove orphaned versioned archives

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

awaiting-review

Suggested reviewers

  • taltas
  • hannesrudolph
  • JamesRobert20

Poem

A bunny hopped through cached old logs,
then found new paths past checksum fogs.
🐰 Flat results now leap in line,
and every locale says “all fine.”
Hop-hop—Semble shines!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately captures the version bump, result flattening, and localization work. |
| Description check | ✅ Passed | The description includes the linked issue, implementation details, testing steps, checklist, and review notes. |
| Linked Issues check | ✅ Passed | The changes match issue #733: v0.4.1 bump, flat-result parsing, i18n status strings, and stale-archive cleanup. |
| Out of Scope Changes check | ✅ Passed | The modified files and tests all support the Semble upgrade and localization goals without obvious unrelated additions. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


codecov Bot commented Jun 27, 2026


Codecov Report

❌ Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/services/code-index/semble/provider.ts | 92.30% | 0 Missing and 1 partial ⚠️ |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/services/code-index/semble/__tests__/semble-downloader.spec.ts`:
- Around line 639-679: The stale-archive cleanup path in downloadSemble is not
actually being exercised because the test only verifies removal of the new
v0.4.1 cache file while readdir is still empty. Update the
semble-downloader.spec test around downloadSemble to simulate an existing
prior-version archive (v0.4.0-*) in the cache and assert that it is deleted
during the upgrade flow, using the same versioned archive path logic already
referenced by versionedArchive and fs.unlink so the cleanup regression is
covered.

In `@src/services/code-index/semble/semble-downloader.ts`:
- Around line 116-127: The cleanupStaleArchives routine only matches names
ending with "-${archiveName}", so it misses the legacy unversioned cache file
name used before v0.4.0. Update the filtering logic in cleanupStaleArchives to
also recognize the exact unversioned archive name for the current archive, while
still excluding currentArchivePath and preserving the existing versioned suffix
cleanup behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ba37b79d-4e4c-46f6-b7e1-a242d398b312

📥 Commits

Reviewing files that changed from the base of the PR and between 6705e67 and c817271.

📒 Files selected for processing (26)
  • src/i18n/locales/ca/embeddings.json
  • src/i18n/locales/de/embeddings.json
  • src/i18n/locales/en/embeddings.json
  • src/i18n/locales/es/embeddings.json
  • src/i18n/locales/fr/embeddings.json
  • src/i18n/locales/hi/embeddings.json
  • src/i18n/locales/id/embeddings.json
  • src/i18n/locales/it/embeddings.json
  • src/i18n/locales/ja/embeddings.json
  • src/i18n/locales/ko/embeddings.json
  • src/i18n/locales/nl/embeddings.json
  • src/i18n/locales/pl/embeddings.json
  • src/i18n/locales/pt-BR/embeddings.json
  • src/i18n/locales/ru/embeddings.json
  • src/i18n/locales/tr/embeddings.json
  • src/i18n/locales/vi/embeddings.json
  • src/i18n/locales/zh-CN/embeddings.json
  • src/i18n/locales/zh-TW/embeddings.json
  • src/services/code-index/semble/__tests__/provider.spec.ts
  • src/services/code-index/semble/__tests__/semble-cli.spec.ts
  • src/services/code-index/semble/__tests__/semble-downloader.spec.ts
  • src/services/code-index/semble/index.ts
  • src/services/code-index/semble/provider.ts
  • src/services/code-index/semble/semble-cli.ts
  • src/services/code-index/semble/semble-downloader.ts
  • src/services/code-index/semble/types.ts

Comment thread src/services/code-index/semble/__tests__/semble-downloader.spec.ts
Comment thread src/services/code-index/semble/semble-downloader.ts
@navedmerchant navedmerchant marked this pull request as draft June 27, 2026 01:35
…e sweep

CodeRabbit PR #734 follow-ups:

- cleanupStaleArchives now also matches the exact unversioned archive name
  (pre-v0.4.0 cache layout) in addition to the version-prefixed suffix, so a
  v0.3.1 -> v0.4.1 upgrade also clears the legacy file. The current archive
  path is still preserved.
- semble-downloader.spec: the version-upgrade test now simulates a prior-version
  archive (v0.4.0-*) and a legacy unversioned archive in the cache, and asserts
  both are swept during the upgrade flow.
- Add coverage for the cleanupStaleArchives catch block (readdir rejects) and
  for the current-archive/unrelated-file preservation behavior.
- provider.spec: add a test that rejects search with a non-Error value to cover
  the 'error instanceof Error' false branch of the telemetry payload (stack:
  undefined), raising provider.ts patch coverage.
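
The sweep rules described in this commit can be sketched as a pure selection function. This is an illustration only: `versionedArchiveName` and `selectStaleArchives` are hypothetical names, the archive file name used in the usage note is made up, and the real `cleanupStaleArchives` additionally reads the directory and unlinks files on a best-effort basis.

```typescript
// Hypothetical helper mirroring the `${SEMBLE_VERSION}-${archive}` cache naming from this PR.
function versionedArchiveName(version: string, archiveName: string): string {
	return `${version}-${archiveName}`
}

// Hypothetical pure helper: given the file names found in the cache directory,
// return the ones that should be deleted after a successful install.
function selectStaleArchives(entries: string[], archiveName: string, currentVersion: string): string[] {
	const current = versionedArchiveName(currentVersion, archiveName)
	return entries.filter((name) => {
		if (name === current) return false // preserve the archive for the active version
		if (name === archiveName) return true // legacy unversioned cache file
		return name.endsWith(`-${archiveName}`) // archive left behind by another version
	})
}
```

For example, with entries `["v0.4.1-semble.tar.gz", "v0.4.0-semble.tar.gz", "semble.tar.gz", "notes.txt"]`, archive name `semble.tar.gz`, and current version `v0.4.1`, the function selects only the v0.4.0 archive and the legacy unversioned file; unrelated files and the current archive are left alone.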
…essage

Export SEMBLE_VERSION from semble-downloader and re-export from the semble
barrel. SembleProvider now interpolates the active version into the
'embeddings:semble.ready' system-state message (set in both _doInitialize and
startIndexing), so the CodeIndexPopover status line shows e.g.
'Indexed - Semble v0.4.1 is ready. Searches index on-the-fly.'

- semble-downloader.ts: export SEMBLE_VERSION.
- index.ts: re-export SEMBLE_VERSION.
- provider.ts: pass { version: SEMBLE_VERSION } to t('embeddings:semble.ready').
- i18n: update semble.ready across all 18 locales to include {{version}}.
- provider.spec.ts: mock SEMBLE_VERSION, update the ready-message mock to
  interpolate version, and add a test asserting the version appears in the
  ready status message.
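
A minimal sketch of the `{{version}}` interpolation described in this commit follows. The `t` function here is a simplified stand-in written for illustration, not the project's real i18n helper, and the English template text is an assumption inferred from the example status line quoted in the commit message.

```typescript
const SEMBLE_VERSION = "v0.4.1"

// ASSUMED English template; the actual locale string may differ.
const messages: Record<string, string> = {
	"embeddings:semble.ready": "Semble {{version}} is ready. Searches index on-the-fly.",
}

// Simplified stand-in for an i18n t() function supporting {{name}} placeholders.
// Unknown keys fall back to the key itself; unknown placeholders are left untouched.
function t(key: string, vars: Record<string, string> = {}): string {
	const template = messages[key] ?? key
	return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => vars[name] ?? match)
}

const readyMessage = t("embeddings:semble.ready", { version: SEMBLE_VERSION })
```

Passing the version as an interpolation variable keeps the version number out of the translated strings, so a future bump only touches `SEMBLE_VERSION`.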
@navedmerchant navedmerchant marked this pull request as ready for review June 27, 2026 02:27

Development

Successfully merging this pull request may close these issues.

[ENHANCEMENT] Upgrade semble to v0.4.1 and localize provider status messages

1 participant