Feat/autoimprove on skill selector by kulvirgit · Pull Request #186 · AltimateAI/altimate-code

kulvirgit · 2026-03-16T18:45:28Z

Summary

What changed and why?

Test Plan

How was this tested?

Checklist

Tests added/updated
Documentation updated (if needed)
CHANGELOG updated (if user-facing)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Restore .trim() on models API JSON to prevent syntax error in generated models-snapshot.ts
- Fix archive path for scoped package names (@altimate/cli-*) in release tarball/zip creation
- Remove gh release upload from build.ts (handled by github-release job)
- Add CHANGELOG.md entry for v0.1.5

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

- Redesign M as 5-wide with visible V-valley to distinguish from A
- Change E top from full bar to open-right, distinguishing from T
- Fix T with full-width crossbar and I as narrow column
- Fix D shape in CODE
- Render CODE in theme.accent (purple) instead of theme.primary (peach)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- publish.ts: change glob from `*/package.json` to `**/package.json` to find scoped package directories (@altimate/cli-*) which are 2 levels deep
- release.yml: add skip-existing to PyPI publish so it doesn't fail when the engine version hasn't changed between releases

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
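The difference between the two glob patterns is easiest to see on a concrete path. The sketch below is a deliberately simplified segment matcher, not the glob implementation that publish.ts uses. `matchGlob` and the sample directory name `cli-linux-x64` are hypothetical, and the matcher only understands `*` (exactly one path segment) and `**` (zero or more segments).

```typescript
// Simplified glob matcher for illustration only.
// `*` matches exactly one path segment, `**` matches zero or more segments.
// Real glob libraries support much more syntax than this.
function matchGlob(pattern: string, filePath: string): boolean {
  const pat = pattern.split("/");
  const segs = filePath.split("/");

  const match = (i: number, j: number): boolean => {
    if (i === pat.length) return j === segs.length;
    if (pat[i] === "**") {
      // Try consuming 0, 1, 2, ... segments with the double star.
      for (let k = j; k <= segs.length; k++) {
        if (match(i + 1, k)) return true;
      }
      return false;
    }
    if (j === segs.length) return false;
    return (pat[i] === "*" || pat[i] === segs[j]) && match(i + 1, j + 1);
  };

  return match(0, 0);
}

// Assumed layout: a scoped package sits two directories below the search root.
const unscoped = "cli-linux-x64/package.json";
const scoped = "@altimate/cli-linux-x64/package.json";

console.log(matchGlob("*/package.json", unscoped)); // true
console.log(matchGlob("*/package.json", scoped)); // false
console.log(matchGlob("**/package.json", scoped)); // true
```

The single-star pattern stops matching as soon as the scope directory adds a second level, which is the failure the commit describes.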

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…(#15780)

…ider (#15619)

@Altimate

The npm org is @AltimateAI, not @Altimate. Update all package names, workspace dependencies, imports, and documentation to use the correct scope so npm publish succeeds.

Name mapping:
- @altimate/cli → @altimateai/altimate-code
- @altimate/cli-sdk → @altimateai/altimate-code-sdk
- @altimate/cli-plugin → @altimateai/altimate-code-plugin
- @altimate/cli-util → @altimateai/altimate-code-util
- @altimate/cli-script → @altimateai/altimate-code-script

Also updates publish.ts to emit the wrapper package as @altimateai/altimate-code (no -ai suffix) and hardcodes the bin entry to altimate-code.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
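As a rough illustration of applying the name mapping to import specifiers, the sketch below looks up the package name exactly (so `@altimate/cli` does not accidentally match as a prefix of `@altimate/cli-sdk`) and preserves any subpath. `renameSpecifier` is a hypothetical helper, not code from this PR, and the subpath in the usage line is made up for the example.

```typescript
// Old -> new package names, copied from the mapping in the commit message.
const RENAMES = new Map<string, string>([
  ["@altimate/cli", "@altimateai/altimate-code"],
  ["@altimate/cli-sdk", "@altimateai/altimate-code-sdk"],
  ["@altimate/cli-plugin", "@altimateai/altimate-code-plugin"],
  ["@altimate/cli-util", "@altimateai/altimate-code-util"],
  ["@altimate/cli-script", "@altimateai/altimate-code-script"],
]);

// Hypothetical helper: rewrite an import specifier, keeping any subpath intact.
function renameSpecifier(specifier: string): string {
  const parts = specifier.split("/");
  // A scoped package name is the first two segments: "@scope/name".
  const name = parts.slice(0, 2).join("/");
  const renamed = RENAMES.get(name);
  if (renamed === undefined) return specifier; // not one of the renamed packages
  return [renamed, ...parts.slice(2)].join("/");
}

console.log(renameSpecifier("@altimate/cli-sdk/client")); // @altimateai/altimate-code-sdk/client
console.log(renameSpecifier("zod")); // zod
```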

Two issues:

1. TypeScript permission-task tests: test fixture wrote config to `opencode.json` but the config loader only looks for `altimate-code.json`. Updated fixture to use correct filename.
2. Python tests: `pytest: command not found` because pyproject.toml had no `dev` optional dependency group. Added `dev` extras with pytest and ruff.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: rename opencode references to altimate-code in all test files

Update test files to use the correct names after the config loader was renamed from opencode to altimate-code:
- `opencode.json` → `altimate-code.json`
- `.opencode/` → `.altimate-code/`
- `.git/opencode` → `.git/altimate-code`
- `OPENCODE_*` env vars → `ALTIMATE_CLI_*`
- Cache dir `opencode` → `altimate-code`
- Schema URL `opencode.ai` → `altimate-code.dev`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve remaining test failures and build import issue

- Fix build.ts solid-plugin import to use bare specifier for monorepo hoisting
- Update agent tests: "build" → "builder", "plan" → "analyst" for disabled fallback
- Fix well-known config mock URL in config.test.ts
- Fix message-v2 test: "OpenCode" → "Altimate CLI"
- Fix retry.test.ts: replace unsupported test.concurrent with test
- Fix read.test.ts: update agent name to "builder"
- Fix agent-color.test.ts: update config keys to "builder"
- Fix registry.test.ts: remove unpublished plugin dep from test fixture
- Skip adding plugin dependency in local dev mode (installDependencies)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address Sentry review comments and Python CI deps

- Update theme schema URL from opencode.ai to altimate-code.dev (33 files)
- Rename opencode references in ACP README.md and AGENTS.md docs
- Update test fixture tmp dir prefix to altimate-code-test-
- Install warehouse extras in Python CI for duckdb/boto3 test deps

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Python CI — SqlGuardResult allows None data, restrict pytest to tests/

- Allow SqlGuardResult.data to be None (fixes lineage.check Pydantic error)
- Set testpaths = ["tests"] in pyproject.toml to exclude src/test_local.py from pytest collection (it's a source module, not a test)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve ruff lint errors in Python engine

- Remove unused imports in server.py (duplicate imports, unused models)
- Remove unused `json` import in schema/cache.py
- Remove unused `os` import in sql/feedback_store.py
- Add noqa for keyring availability check import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use import.meta.resolve to find the @opentui/core package directory instead of hardcoding node_modules path, which fails with monorepo hoisting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
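A minimal sketch of the idea, assuming POSIX-style paths: `import.meta.resolve()` returns the URL of a package's entry point wherever the package manager placed it, and the package directory can be derived from that URL instead of being hardcoded. `packageRootFromEntry` is a hypothetical helper, and the two sample URLs (including the `index.js` entry file name) are made up to stand in for a hoisted and a non-hoisted install.

```typescript
// Hypothetical helper: given the URL that import.meta.resolve() returned for a
// package's entry point, return that package's root directory.
function packageRootFromEntry(entryUrl: string, pkgName: string): string {
  const entryPath = decodeURIComponent(new URL(entryUrl).pathname);
  const marker = `/node_modules/${pkgName}/`;
  // Use the last occurrence so nested node_modules directories are handled.
  const idx = entryPath.lastIndexOf(marker);
  if (idx === -1) {
    throw new Error(`${pkgName} not found in ${entryPath}`);
  }
  // Drop the trailing slash of the marker.
  return entryPath.slice(0, idx + marker.length - 1);
}

// In an ES module the entry URL would come from:
//   const entryUrl = import.meta.resolve("@opentui/core");
// Sample URLs standing in for a hoisted and a non-hoisted install:
const hoisted = "file:///repo/node_modules/@opentui/core/index.js";
const nested = "file:///repo/packages/cli/node_modules/@opentui/core/index.js";

console.log(packageRootFromEntry(hoisted, "@opentui/core"));
// /repo/node_modules/@opentui/core
console.log(packageRootFromEntry(nested, "@opentui/core"));
// /repo/packages/cli/node_modules/@opentui/core
```

Because the directory is computed from wherever resolution actually landed, the same code works whether or not the dependency was hoisted to the monorepo root.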

…aming

- Build: output binary as altimate-code instead of opencode
- Bin wrapper: look for @altimateai/altimate-code-* scoped packages
- Postinstall: resolve @AltimateAI scoped platform packages
- Publish: update Docker/AUR/Homebrew refs to AltimateAI/altimate-code
- Publish: make Docker/AUR/Homebrew non-fatal (infra not set up yet)
- Dockerfile: update binary paths and names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Jérôme Benoit <jerome.benoit@piment-noir.org>

…sion (#15762) Co-authored-by: Test User <test@test.com> Co-authored-by: Shoubhit Dash <shoubhit2005@gmail.com>

Post-merge now runs build, test, and typecheck before declaring success. Stops on build failure; warns on test/typecheck failures (since some may be pre-existing). Co-Authored-By: Kai (Claude Opus 4.6) <noreply@anthropic.com>
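The stop-versus-warn policy described above can be sketched as follows. This is an illustration under assumptions, not the merge script itself: `postMerge` and its `run` callback (which executes one step and reports whether it succeeded) are hypothetical names.

```typescript
type StepName = "build" | "test" | "typecheck";

interface PostMergeResult {
  ok: boolean; // false only when the build itself failed
  warnings: StepName[]; // steps that failed but did not block the merge
}

// `run` is a hypothetical callback that executes a step and returns true on success.
function postMerge(run: (step: StepName) => boolean): PostMergeResult {
  // A build failure is fatal: stop before running anything else.
  if (!run("build")) {
    return { ok: false, warnings: [] };
  }
  // Test and typecheck failures only produce warnings, since some may be pre-existing.
  const warnings: StepName[] = [];
  for (const step of ["test", "typecheck"] as const) {
    if (!run(step)) warnings.push(step);
  }
  return { ok: true, warnings };
}

// Example: the build passes but the tests fail.
console.log(postMerge((step) => step !== "test"));
// { ok: true, warnings: [ 'test' ] }
```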

chore: merge upstream opencode v1.2.20

- Add null guard to `NamedError.isInstance()` preventing `TypeError` on null input
- Add Azure OpenAI overflow detection patterns (`the request was too long`, `maximum tokens for requested operation`)
- Fix token estimation tests to use new content-aware ratio (3.7)
- Fix compaction/prune config tests: `altimate-code.json` → `opencode.json`
- Fix install test fixtures: `@opencode-ai/opencode` → `@altimateai/altimate-code`
- Fix `bin/altimate-code` entry in `package.json` to point to dedicated wrapper
- Fix test import paths for relocated modules (`bridge/client`, `bridge/engine`, `tool/project-scan`)
- Add synthetic pending emission + bash output streaming to ACP agent event handler
- Add `pendingEmitted` tracking set and clear bash snapshots on pending state

Co-Authored-By: Kai (Claude Opus 4.6) <noreply@anthropic.com>
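To make the first two items concrete, here is a small sketch that combines a null guard with pattern-based overflow detection. The two pattern strings are the ones quoted in the commit message. Everything else is an assumption for illustration: `isOverflowMessage` is a hypothetical helper, and case-insensitive matching is a choice made here, not something the commit states.

```typescript
// Patterns quoted in the commit message. Case-insensitive matching is an assumption.
const AZURE_OVERFLOW_PATTERNS: RegExp[] = [
  /the request was too long/i,
  /maximum tokens for requested operation/i,
];

// Hypothetical helper: returns true if an error message looks like a
// context-overflow error. Null or undefined input is rejected up front
// instead of throwing a TypeError.
function isOverflowMessage(message: string | null | undefined): boolean {
  if (message == null) return false;
  return AZURE_OVERFLOW_PATTERNS.some((re) => re.test(message));
}

console.log(isOverflowMessage("Error: The request was too long.")); // true
console.log(isOverflowMessage("rate limit exceeded")); // false
console.log(isOverflowMessage(null)); // false
```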

fix: resolve all test failures from fork restructure

`@gitlab/opencode-gitlab-auth` depends on `@opencode-ai/plugin@1.2.10` (npm) while the workspace has `@opencode-ai/plugin@1.2.20`. The `_HeyApiClient` protected property differs between versions, causing a TS2322 error. Cast to `PluginInstance` since the types are structurally compatible at runtime. Co-Authored-By: Kai (Claude Opus 4.6) <noreply@anthropic.com>
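The underlying TypeScript behavior can be reproduced in isolation. In the sketch below, `ClientV10` and `ClientV20` are hypothetical stand-ins for the two plugin versions. Two classes that each declare their own protected member are not assignable to one another even when their shapes are identical, so a cast is needed. The double cast through `unknown` is used here because it always compiles; the PR's actual cast to `PluginInstance` may be written differently.

```typescript
// Stand-ins for the same class shipped by two different package versions.
class ClientV10 {
  protected _client = "1.2.10";
  ping(): string {
    return "pong";
  }
}

class ClientV20 {
  protected _client = "1.2.20";
  ping(): string {
    return "pong";
  }
}

function register(client: ClientV20): string {
  return client.ping();
}

const fromNpm = new ClientV10();

// register(fromNpm);
// ^ rejected by the compiler: `_client` is protected, and ClientV10 is not
//   a class derived from ClientV20, even though the shapes are identical.

// The runtime objects are compatible, so a cast restores the call.
const result = register(fromNpm as unknown as ClientV20);
console.log(result); // pong
```

The cast silences the compiler only. It is safe here because the public surface used at runtime is the same in both versions, which mirrors the commit's reasoning.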

fix: resolve GitlabAuthPlugin type incompatibility

* chore: use ARC self-hosted runner for CI

* chore: use ARC self-hosted runner for release workflow

* fix: harden ARC runner migration with security and reliability safeguards

- Fall back to `ubuntu-latest` for fork PRs to prevent untrusted code execution on self-hosted ARC runners
- Add `timeout-minutes: 60` to all CI and release jobs (self-hosted runners have no default timeout unlike GitHub-hosted 6h limit)
- Write `NPM_TOKEN` to `$RUNNER_TEMP/.npmrc` instead of `~/.npmrc` to prevent secret persistence on self-hosted runners
- Set `NPM_CONFIG_USERCONFIG` to point publish step to temp `.npmrc`
- Add `concurrency` group to CI workflow to cancel superseded runs

---------

Co-authored-by: anandgupta42 <anand@altimate.ai>

…ws) (#94)

ci: switch Windows tests to ARC self-hosted runner

…ck required tools)

Co-Authored-By: Kai (Claude Opus 4.6) <noreply@anthropic.com>

…onfig fix: restore TUI crash after upstream merge

…rkflow

fix: correct TEAM_MEMBERS ref from 'dev' to 'main' in pr-standards workflow

- Add `AltimateApi` client for datamate CRUD and integration resolution
- Add `datamate` tool with 9 operations: list, show, create, update, delete, add (MCP connect), remove (MCP disconnect), list-integrations, status
- Extract shared MCP config utilities (`resolveConfigPath`, `addMcpToConfig`, `removeMcpFromConfig`, `listMcpInConfig`) to `mcp/config.ts`
- Add `/datamate-setup` skill for guided datamate onboarding
- Register datamate tool in tool registry and TUI sync context
- Add test suite for `AltimateApi` credential loading and API methods

feat: datamate manager — dynamic MCP server management

…tion

- Add `skill-selector.ts` with `selectSkillsWithLLM()` — uses small model (Haiku 4.5 via `Provider.getSmallModel`) + `generateObject` to semantically select relevant skills based on user message and project fingerprint
- Remove `partitionByFingerprint()` and `rescueByMessage()` from `tool/skill.ts`
- Remove `tags` field from `Skill.Info` schema and all 51 SKILL.md files
- Add 2 new dbt-labs skills: `migrating-dbt-core-to-fusion`, `migrating-dbt-project-across-platforms`
- Rewrite tests with `SkillSelectorDeps` DI interface for model + generate mocking
- Graceful fallback: returns all skills on timeout (3s), error, no model, or zero selection
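The graceful-fallback rule in the last item can be sketched roughly as below. This is not the PR's `selectSkillsWithLLM()`: the `Skill` and `SelectorDeps` shapes, the `pick` callback (standing in for the small-model call), and the `withTimeout` helper are all hypothetical, chosen only to show the four fallback cases with an injectable dependency.

```typescript
interface Skill {
  name: string;
  description: string;
}

// Hypothetical dependency-injection shape; the PR's SkillSelectorDeps may differ.
interface SelectorDeps {
  // Stand-in for the small-model call: resolves to the names of selected skills.
  pick?: (message: string, skills: Skill[]) => Promise<string[]>;
  timeoutMs?: number; // defaults to the 3s mentioned in the commit message
}

// Resolve to null if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T | null> {
  return new Promise<T | null>((resolve, reject) => {
    const timer = setTimeout(() => resolve(null), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

async function selectSkills(
  message: string,
  skills: Skill[],
  deps: SelectorDeps,
): Promise<Skill[]> {
  if (!deps.pick) return skills; // no model available
  try {
    const names = await withTimeout(deps.pick(message, skills), deps.timeoutMs ?? 3000);
    if (names === null) return skills; // timed out
    const chosen = skills.filter((skill) => names.includes(skill.name));
    return chosen.length > 0 ? chosen : skills; // zero selection
  } catch {
    return skills; // model call failed
  }
}
```

Injecting `pick` keeps the fallback logic testable without a real model, which is the same motivation the commit gives for its DI interface.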

Full benchmark infrastructure for evaluating altimate-code against the Spider 2.0-DBT leaderboard (68 dbt+DuckDB tasks). Includes setup, runner, evaluator, and reporting.

- Fix `sql_validate` → `altimate_core_validate` across all agent prompts
- Improve analyst schema discovery and builder output discipline
- Comment out non-builder agents for benchmark focus
- Add retry limits to prevent infinite retry loops in session processor
- Remove stale IMPLEMENTATION_PLAN.md

- Delete `altimate-setup` skill definition
- Remove "Available Skills" section from `builder.txt` prompt
- Skills were not improving benchmark performance (26.47% vs 32.35% baseline)

Instrument `altimate run` with Langfuse observability (activated via LANGFUSE_* env vars) so benchmark runs capture tool calls, tokens, cost, and timing as traces. After evaluation, traces are updated with benchmark_pass scores for end-to-end visibility in the Langfuse dashboard. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Generations previously had empty input (no context) and empty output when the model only produced tool calls. Now:
- Input shows tool results from the preceding step
- Output falls back to "[tool calls: read, write, ...]" when no text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
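A minimal sketch of the output fallback, assuming the placeholder format quoted above. `generationOutput` is a hypothetical helper written in TypeScript for illustration; the instrumentation in this PR may live in a different language and have a different shape.

```typescript
// Hypothetical helper: choose what to record as a generation's output.
// If the model produced text, use it; otherwise summarize the tool calls.
function generationOutput(text: string, toolNames: string[]): string {
  if (text.trim().length > 0) return text;
  if (toolNames.length === 0) return "";
  return `[tool calls: ${toolNames.join(", ")}]`;
}

console.log(generationOutput("", ["read", "write"])); // [tool calls: read, write]
console.log(generationOutput("Done.", ["read"])); // Done.
```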

- Rewrite builder prompt to reflect data engineer workflow (understand data → write → build → verify) instead of QA checklists
- Remove cosmetic tools (format, grade, generate-tests, FinOps) that don't affect correctness
- Deduplicate tool mentions (each tool listed once)
- Patch deprecated dbt config keys (source-paths → model-paths, data-paths → seed-paths) in prepare_workspace to suppress warnings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
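The key patch in the last item amounts to renaming two keys in the parsed project config. The sketch below shows that transformation on a plain object. It is written in TypeScript for illustration and is not the `prepare_workspace` code; `patchDeprecatedKeys` is a hypothetical name, and the rule of keeping an already-present new key is an assumption made here.

```typescript
// Deprecated dbt project keys and their replacements, as listed in the commit message.
const DEPRECATED_KEYS = new Map<string, string>([
  ["source-paths", "model-paths"],
  ["data-paths", "seed-paths"],
]);

// Hypothetical helper: return a copy of the config with deprecated keys renamed.
function patchDeprecatedKeys(config: Record<string, unknown>): Record<string, unknown> {
  const patched: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(config)) {
    const renamed = DEPRECATED_KEYS.get(key) ?? key;
    // Assumption: if the new key is already set, keep it and drop the deprecated one.
    if (renamed !== key && renamed in config) continue;
    patched[renamed] = value;
  }
  return patched;
}

console.log(
  patchDeprecatedKeys({ name: "demo", "source-paths": ["models"], "data-paths": ["data"] }),
);
// { name: 'demo', 'model-paths': [ 'models' ], 'seed-paths': [ 'data' ] }
```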

The v4 SDK renamed `score()` to `create_score()`. Use `create_score` with fallback to `score` for backwards compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When the builder agent declares it's done, automatically inject a verification prompt into the same conversation. The verification step checks dbt build status, column order against YAML specs, and row counts against source data — fixing issues before the session ends. Also adds benchmark prompt improvements: column ordering guidance, date range bounds, row count verification, staging model preference, final build enforcement, and mandatory data inspection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This reverts commit bd6eeeb.

dev-punia-altimate · 2026-03-17T14:15:01Z

✅ Tests — All Passed

TypeScript — passed

Python — passed

Tested at 9cb7d6f7 | Run log | Powered by QA Autopilot

anandgupta42 and others added 30 commits March 2, 2026 17:26

release: v0.1.5

500f0f2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

release: v0.1.6

340ce8f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

release: v0.1.7

0a08668

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(opencode): clone part data in Bus event to preserve token values …

fd6f713

…(#15780)

fix(provider): forward metadata options to cloudflare-ai-gateway prov…

96d6fb7

…ider (#15619)

chore: generate

e41b535

release: v0.1.8

dc61802

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix(opencode): avoid gemini combiner schema sibling injection (#15318)

7e3e85b

chore: generate

9f150b0

release: v0.1.9

614c180

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: resolve @opentui/core parser.worker.js via import.meta.resolve

8738470

Use import.meta.resolve to find the @opentui/core package directory instead of hardcoding node_modules path, which fails with monorepo hoisting. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

wip: zen

6aa4928

chore: generate

881ca86

wip: zen

1233ebc

wip: zen

b985ea3

zen: docs

6deb27e

release: v0.1.10

391f365

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: nix flake update for bun 1.3.10 (#15648)

48412f7

Co-authored-by: Jérôme Benoit <jerome.benoit@piment-noir.org>

fix(opencode): disable session navigation commands when no parent ses…

18850c4

…sion (#15762) Co-authored-by: Test User <test@test.com> Co-authored-by: Shoubhit Dash <shoubhit2005@gmail.com>

fix(app): timeline jank

5e8742f

fix(app): timeline jank

e4af1bb

chore: fix test

1e2da60

chore: cleanup

7305fc0

kulvirgit and others added 28 commits March 6, 2026 17:25

fix: add build, test, and typecheck steps to merge script

89b6c44

Post-merge now runs build, test, and typecheck before declaring success. Stops on build failure; warns on test/typecheck failures (since some may be pre-existing). Co-Authored-By: Kai (Claude Opus 4.6) <noreply@anthropic.com>

Merge pull request #87 from AltimateAI/merge/upstream-v1.2.20

a6e1f2e

chore: merge upstream opencode v1.2.20

Merge pull request #91 from AltimateAI/fix/test-failures-v2

473523e

fix: resolve all test failures from fork restructure

Merge pull request #92 from AltimateAI/fix/typecheck-plugin-type

1a00dd9

fix: resolve GitlabAuthPlugin type incompatibility

chore: migrate test.yml to ARC runners (linux) + GitHub-hosted (windo…

7d02c3e

…ws) (#94)

ci: switch Windows test jobs to ARC self-hosted runner

8fc3d95

Merge pull request #95 from AltimateAI/feat/arc-windows-runner

c726206

ci: switch Windows tests to ARC self-hosted runner

ci: revert Windows tests to windows-latest (ARC Windows containers la…

34524f0

…ck required tools)

fix: restore TUI crash after upstream merge

c113580

Co-Authored-By: Kai (Claude Opus 4.6) <noreply@anthropic.com>

Merge pull request #98 from AltimateAI/fix/restore-branding-and-tui-c…

5740133

…onfig fix: restore TUI crash after upstream merge

fix: correct TEAM_MEMBERS ref from 'dev' to 'main' in pr-standards wo…

ffd76bc

…rkflow

Merge pull request #101 from AltimateAI/fix/pr-standards-workflow

4ae73c0

fix: correct TEAM_MEMBERS ref from 'dev' to 'main' in pr-standards workflow

Merge pull request #99 from AltimateAI/feat/datamate-manager-clean

9a02f27

feat: datamate manager — dynamic MCP server management

feat: add Spider2-DBT benchmark harness

d367036

Full benchmark infrastructure for evaluating altimate-code against the Spider 2.0-DBT leaderboard (68 dbt+DuckDB tasks). Includes setup, runner, evaluator, and reporting.

chore: remove skills from builder agent prompt

94b0249

- Delete `altimate-setup` skill definition
- Remove "Available Skills" section from `builder.txt` prompt
- Skills were not improving benchmark performance (26.47% vs 32.35% baseline)

improvemnets 1

a29b4d1

fix: support Langfuse Python SDK v4 score API (create_score)

cfdc056

The v4 SDK renamed `score()` to `create_score()`. Use `create_score` with fallback to `score` for backwards compatibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Revert "feat: add auto-verification pass for builder agent"

9cb7d6f

This reverts commit bd6eeeb.

anandgupta42 force-pushed the main branch from c86970e to d097682 Compare March 17, 2026 00:40

kulvirgit wants to merge 10000 commits into main from feat/autoimprove-on-skill-selector

20 participants
