feat(claude-code): auto-discover plugin marketplace agents #231

Open
leih1219 wants to merge 11 commits into awslabs:main from leih1219:claude-plugin-discovery

Conversation

@leih1219 leih1219 commented May 8, 2026

Summary

Auto-discover agents from enabled Claude Code plugin marketplaces (those
declared in ~/.claude/settings.json → extraKnownMarketplaces and
enabled in enabledPlugins). Today, agents installed as part of a
Claude Code plugin are invisible to CAO unless the user manually points
agent_dirs.claude_code at the plugin's agents/ directory. This change
makes them discoverable by default.

Motivation

Claude Code is the only supported provider without an auto-populated
default agent directory. constants.py defines:

Q_AGENTS_DIR        = ~/.aws/amazonq/cli-agents
KIRO_AGENTS_DIR     = ~/.kiro/agents
COPILOT_AGENTS_DIR  = ~/.copilot/agents
OPENCODE_AGENTS_DIR = ~/.aws/opencode/agents
# no equivalent for Claude Code

Users who install a Claude Code plugin that ships agents (via the plugin
marketplace mechanism) expect those agents to be selectable from CAO
without further manual setup. This PR closes that gap for the
marketplace-plugin case.

What changed

  • src/cli_agent_orchestrator/utils/agent_profiles.py — adds
    _discover_claude_plugin_agent_dirs(). It reads
    ~/.claude/settings.json, iterates extraKnownMarketplaces, parses
    each marketplace's .claude-plugin/marketplace.json, and for each
    plugin present in enabledPlugins["<plugin>@<marketplace>"] collects
    the plugin's agents/ directory. Integrated into both
    list_agent_profiles() and _read_agent_profile_source() between
    provider directories and extra user-added directories.
  • src/cli_agent_orchestrator/services/settings_service.py — changes
    the default agent_dirs.claude_code value from
    ~/.aws/cli-agent-orchestrator/agent-store to ~/.claude/agents/,
    aligning with Claude Code's native user-level subagent directory.
    Users with a saved agent_dirs.claude_code value in their
    settings.json are unaffected because get_agent_dirs() merges saved
    values over defaults.
  • docs/settings.md — new section documenting plugin-marketplace
    discovery, precedence order, how to disable via removing
    enabledPlugins, and known limitations.
  • CHANGELOG.md — Unreleased entry.
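
The discovery walk in the first bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: it assumes each extraKnownMarketplaces entry points at a local directory through source.path, and that marketplace.json lists plugins with name and source fields. The names load_json and discover_plugin_agent_dirs are hypothetical.

```python
import json
import logging
from pathlib import Path
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def load_json(path: Path) -> Dict[str, Any]:
    """Parse a JSON object; on a missing or malformed file, warn once and return {}."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, OSError) as exc:
        logger.warning("Could not read %s: %s", path, exc)
        return {}
    return data if isinstance(data, dict) else {}


def discover_plugin_agent_dirs(settings_path: Path) -> List[Path]:
    """Collect agents/ directories of enabled plugins (assumed JSON layout)."""
    settings = load_json(settings_path)
    marketplaces = settings.get("extraKnownMarketplaces")
    enabled = settings.get("enabledPlugins")
    if not isinstance(marketplaces, dict) or not isinstance(enabled, dict):
        return []

    found: List[Path] = []
    for mkt_name, mkt_cfg in marketplaces.items():
        # ASSUMPTION: the marketplace is a local directory at source.path.
        source = mkt_cfg.get("source") if isinstance(mkt_cfg, dict) else None
        mkt_dir = source.get("path") if isinstance(source, dict) else None
        if not isinstance(mkt_dir, str):
            continue
        mkt_root = Path(mkt_dir).expanduser()
        manifest = load_json(mkt_root / ".claude-plugin" / "marketplace.json")
        plugins = manifest.get("plugins")
        if not isinstance(plugins, list):
            continue
        for plugin in plugins:
            if not isinstance(plugin, dict):
                continue
            name, src = plugin.get("name"), plugin.get("source")
            if not isinstance(name, str) or not isinstance(src, str):
                continue
            # Only plugins switched on under "<plugin>@<marketplace>" count.
            if not enabled.get(f"{name}@{mkt_name}"):
                continue
            agents_dir = mkt_root / src / "agents"
            if agents_dir.is_dir():
                found.append(agents_dir)
    return found
```

The sketch omits the containment checks and caching covered in the sections that follow; it only shows the settings → marketplace → plugin → agents/ traversal.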

Security & path safety

  • Marketplace-directory containment: plugin_dir.resolve().relative_to(marketplace_root.resolve())
    prevents traversal across marketplace boundaries.
  • Per-file containment (added in round 2, commit b43a1d2):
    _scan_plugin_directory resolves each candidate file and validates
    resolve().relative_to(plugin_root) before adding it. Symlinks inside
    agents/ that point outside the plugin root are rejected with a log
    warning. Scoped narrowly to the claude_plugin discovery path — other
    _scan_directory callers are untouched.
  • All JSON reads (settings.json, marketplace.json) are wrapped with
    JSONDecodeError + OSError handlers. Missing or malformed files
    produce a single log warning and return an empty list.
  • Discovery runs only on the claude_code provider path; other
    providers are untouched.
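
The resolve() + relative_to() containment pattern named in the first two bullets can be illustrated like this. It is a sketch only: is_contained and scan_contained are hypothetical names, and the "*.md" glob is an assumption about how agent files are enumerated.

```python
import logging
from pathlib import Path
from typing import List

logger = logging.getLogger(__name__)


def is_contained(candidate: Path, root: Path) -> bool:
    """True if candidate, after following symlinks, lies inside root."""
    try:
        candidate.resolve().relative_to(root.resolve())
    except (ValueError, OSError, RuntimeError):
        # ValueError: outside root. OSError/RuntimeError: unresolvable path or symlink loop.
        return False
    return True


def scan_contained(agents_dir: Path, root: Path) -> List[Path]:
    """Return agent files under agents_dir whose real location stays within root."""
    kept: List[Path] = []
    for entry in sorted(agents_dir.glob("*.md")):
        if is_contained(entry, root):
            kept.append(entry)
        else:
            logger.warning("Skipping %s: resolves outside %s", entry, root)
    return kept
```

Resolving before comparing is what makes the check meaningful: a symlink's own path sits inside agents/, and only its resolved target reveals the escape.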

Round 2 review response (2026-05-12)

Three commits address @anilkmr-a2z's review feedback, plus one test-coverage commit:

  • 6a93db3 fix(agent_profiles): strip multiple leading HTML comment blocks. Regex changed from ^<!--.*?-->\s* to ^(?:<!--.*?-->\s*)+ so consecutive leading blocks are stripped in a single pass.
  • b43a1d2 fix(agent_profiles): validate per-file containment for plugin agents. Adds _scan_plugin_directory wrapper with resolve() + relative_to() on each file; scope is narrow (claude_plugin only) and a regression guard test pins the scope.
  • 878f315 perf(agent_profiles): cache claude plugin discovery with mtime-based invalidation. Module-level cache keyed on settings.json mtime plus each marketplace.json mtime. Automatic invalidation on any tracked-file change.
  • 22e3e92 test(agent_profiles): expand coverage for plugin discovery fixes. +7 tests and +2 mocks, pushing agent_profiles.py line coverage from 85% → 92%.
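
For readers unfamiliar with mtime-keyed caching (the 878f315 item), a minimal sketch of the idea follows. It is not the PR's implementation: cached_discovery, reset_cache, and the caller-supplied list of tracked files are all hypothetical, and the real code derives the tracked marketplace.json paths from settings.json itself.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List, Optional, Tuple

Fingerprint = Tuple[Tuple[str, Optional[int]], ...]

# Module-level cache holding the last fingerprint and the last result.
_CACHE: Dict[str, Any] = {}


def _fingerprint(paths: List[Path]) -> Fingerprint:
    """Pair each tracked file with its mtime in ns (None if the file is missing)."""
    items = []
    for path in paths:
        try:
            mtime: Optional[int] = path.stat().st_mtime_ns
        except OSError:
            mtime = None
        items.append((str(path), mtime))
    return tuple(items)


def cached_discovery(tracked: List[Path], compute: Callable[[], List[Path]]) -> List[Path]:
    """Return the cached result unless a tracked file changed, appeared, or vanished."""
    key = _fingerprint(tracked)
    if _CACHE.get("key") == key:
        return list(_CACHE["value"])
    value = compute()
    _CACHE["key"] = key
    _CACHE["value"] = list(value)
    return value


def reset_cache() -> None:
    """Drop cached state, e.g. between tests."""
    _CACHE.clear()
```

Because the set of tracked paths is part of the key, adding or removing a marketplace changes the fingerprint even when no individual mtime moves.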

Tests

test/utils/test_claude_plugin_discovery.py now covers:

  • Happy path: enabled plugin with agents/ directory
  • Missing / malformed ~/.claude/settings.json
  • Malformed marketplace.json
  • Plugin source pointing at a non-existent path, or at a file instead of
    a directory
  • Plugin present but not enabled
  • Cross-marketplace name collision (first-wins dedup)
  • Path traversal attempt blocked at marketplace → plugin hop
  • Orphan enabled-plugin entry referencing a missing marketplace
  • Empty enabledPlugins map
  • list_agent_profiles integration: plugin source label, local-beats-plugin dedup
  • _read_agent_profile_source integration: found in plugin dir, raises
    FileNotFoundError when not found anywhere
  • Per-file symlink containment (round 2): symlink inside plugin
    agents/ pointing outside plugin root is rejected with a warning;
    symlinks resolving within the plugin root are accepted
  • Scope regression guard (round 2): symlinks in non-plugin
    directories (e.g., ~/.kiro/agents/) are NOT rejected — the
    narrow-scope decision is pinned
  • Discovery cache (round 2): cache hit avoids recomputation;
    mtime changes on settings.json or any marketplace.json force
    re-discovery; new-marketplace-added and marketplace-disappeared cases;
    explicit _reset_plugin_discovery_cache()
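
The first-wins dedup that several of these tests pin (local beats plugin, earlier marketplace beats later) can be pictured with the sketch below. The function name and the source labels are hypothetical; only the ordering rule comes from the description above.

```python
from typing import Dict, Iterable, List, Tuple


def merge_first_wins(sources: Iterable[Tuple[str, List[str]]]) -> Dict[str, str]:
    """Map each agent name to the label of the first source that provides it."""
    chosen: Dict[str, str] = {}
    for label, names in sources:
        for name in names:
            # setdefault keeps the earlier source when a name repeats.
            chosen.setdefault(name, label)
    return chosen


# Sources are listed in precedence order, highest first.
ordered = [
    ("local", ["reviewer"]),
    ("plugin:mkt-a", ["reviewer", "planner"]),
    ("plugin:mkt-b", ["planner", "tester"]),
]
print(merge_first_wins(ordered))
```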

test/utils/test_agent_profiles.py additionally covers multi-block HTML
comment stripping (1, 2, and 3+ leading blocks; multi-line blocks;
non-leading comments not stripped).

Run locally:

pytest test/utils/test_claude_plugin_discovery.py test/utils/test_agent_profiles.py -v

Coverage on agent_profiles.py: 92%.
Full non-infra suite: 1642 passed / 5 skipped.

Backwards compatibility

  • Users with agent_dirs.claude_code saved in
    ~/.aws/cli-agent-orchestrator/settings.json keep their existing
    behavior (saved-over-default merge).
  • Users on the old default silently switch from scanning
    ~/.aws/cli-agent-orchestrator/agent-store to ~/.claude/agents/.
    Since agent-store is already scanned as LOCAL_AGENT_STORE_DIR,
    this is a no-op for agent visibility.
  • _discover_claude_plugin_agent_dirs return type changed from
    List[Path] to List[Tuple[Path, Path]] (adding the marketplace
    root for per-file containment). The function is underscore-prefixed
    and private to agent_profiles.py; all in-source callers are
    updated.
  • Other public function signatures (list_agent_profiles,
    _read_agent_profile_source, load_agent_profile) are unchanged.
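
The saved-over-default merge that the first two bullets rely on behaves roughly like the sketch below. merge_agent_dirs is a hypothetical stand-in for get_agent_dirs(), and the defaults dict shows only the one key this PR changes.

```python
from typing import Dict, Optional

# New default introduced by this PR for the claude_code provider.
DEFAULT_AGENT_DIRS: Dict[str, str] = {"claude_code": "~/.claude/agents/"}


def merge_agent_dirs(saved: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Saved values override defaults; keys missing from saved fall back to defaults."""
    return {**DEFAULT_AGENT_DIRS, **(saved or {})}


# A user with a saved value keeps it; a user without one picks up the new default.
print(merge_agent_dirs({"claude_code": "/opt/my-agents"}))
print(merge_agent_dirs(None))
```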

Follow-ups (intentionally out of scope)

  • Workspace-level .claude/agents/ discovery (project-local agents)

Happy to open a separate issue for the above if the maintainers prefer tracking it.

Agents enabled via Claude Code plugin marketplaces
(~/.claude/settings.json → extraKnownMarketplaces) are now discoverable by
CAO without requiring a manual agent_dirs.claude_code entry in
~/.aws/cli-agent-orchestrator/settings.json.

Changes:
- utils/agent_profiles.py: add _discover_claude_plugin_agent_dirs() that
  walks enabled marketplaces, validates marketplace.json, and collects
  each plugin's agents/ directory. Integrated into list_agent_profiles()
  and _read_agent_profile_source() scan order.
- services/settings_service.py: change default agent_dirs.claude_code from
  ~/.aws/cli-agent-orchestrator/agent-store to ~/.claude/agents/. Users
  with a saved value are unaffected (saved-over-default merge semantics).
- docs/settings.md: document the new discovery behavior.
- test/utils/test_claude_plugin_discovery.py: unit tests covering happy
  path, empty enabledPlugins, orphan plugin entries, file-vs-dir source
  validation, cross-marketplace name collision, symlink escape, and
  _read_agent_profile_source() plugin integration.
- CHANGELOG.md: Unreleased entry noting discovery + default path change.
@haofeif haofeif requested a review from patricka3125 May 8, 2026 01:41

haofeif commented May 8, 2026

Contributor

@patricka3125 would you like to review this if you get a chance?

codecov-commenter commented May 8, 2026


Codecov Report

❌ Patch coverage is 85.84071% with 16 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@acb78e6). Learn more about missing BASE report.

Files with missing lines Patch % Lines
src/cli_agent_orchestrator/utils/agent_profiles.py 85.84% 16 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #231   +/-   ##
=======================================
  Coverage        ?   92.73%           
=======================================
  Files           ?       65           
  Lines           ?     5548           
  Branches        ?        0           
=======================================
  Hits            ?     5145           
  Misses          ?      403           
  Partials        ?        0           
Flag Coverage Δ
unittests 92.73% <85.84%> (?)

@haofeif haofeif added the enhancement New feature or request label May 8, 2026
Lei Huang and others added 3 commits May 8, 2026 17:43
…arsing

Some profile generators (e.g. AIM `plugins install --local`) prepend
<!-- ... --> HTML comment blocks before the YAML frontmatter delimiter.
python-frontmatter requires '---' on the first line, so these profiles
silently parse with empty metadata — causing mcpServers, model, and
allowedTools to be None at runtime.

Add a defensive regex strip at the top of parse_agent_profile_text()
so CAO handles these profiles correctly regardless of upstream generator
behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
plugin_dir = (mkt_path / plugin_source).resolve()
# Path containment check
try:
    plugin_dir.relative_to(resolved_mkt)

Contributor

Individual files still could be outside. Probably should validate individual files too.

Author

Addressed in b43a1d2 — added a _scan_plugin_directory wrapper that resolves each candidate file and validates resolve().relative_to(plugin_root) before adding it. Scoped narrowly to the claude_plugin discovery path (not broadening _scan_directory, so other agent sources keep their existing behavior).

Tests covering this:

  • test_symlink_outside_plugin_root_rejected — symlink inside agents/ pointing outside plugin root is skipped with a warning
  • test_symlink_within_plugin_root_accepted — symlinks resolving within the plugin root still work
  • test_read_agent_profile_source_rejects_symlink_escape — covers the second call site end-to-end
  • TestFixAScopeIsNarrow::test_symlink_escape_in_non_plugin_dir_not_blocked — regression guard so the narrow-scope decision is pinned

Comment thread src/cli_agent_orchestrator/utils/agent_profiles.py Outdated
# Strip leading HTML comments before the YAML frontmatter fence.
# Some profile generators (e.g. AIM) prepend <!-- ... --> blocks that
# prevent python-frontmatter from detecting the opening '---' delimiter.
resolved_text = re.sub(r"^<!--.*?-->\s*", "", resolved_text, flags=re.DOTALL)

Contributor

HTML comment stripping regex only removes a single leading comment block. If a generator produces multiple consecutive comments (e.g., <!-- ... --> <!-- ... -->), only the first is stripped and the second still blocks frontmatter detection.

Author

Addressed in 6a93db3 — changed the regex from ^<!--.*?-->\s* to ^(?:<!--.*?-->\s*)+ so all consecutive leading comment blocks are stripped in a single re.sub call. Linear-time (no backtracking risk — the inner .*?--> is bounded by a literal terminator, the outer + advances sequentially).

Tests:

  • test_multiple_leading_html_comments_stripped — two consecutive blocks
  • test_three_leading_html_comments_stripped — three blocks (edge case)
  • Existing test_leading_html_comment_stripped_before_frontmatter / test_multiline_html_comment_stripped / test_no_comment_passthrough / test_non_leading_comment_not_stripped all still pass
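
To make the difference between the two patterns concrete, here is a small standalone comparison. It is a sketch, not the code in parse_agent_profile_text(); the helper names and the sample text are illustrative.

```python
import re

# Pattern before 6a93db3: strips only the first leading comment block.
SINGLE = re.compile(r"^<!--.*?-->\s*", flags=re.DOTALL)
# Pattern after 6a93db3: strips every consecutive leading comment block.
MULTI = re.compile(r"^(?:<!--.*?-->\s*)+", flags=re.DOTALL)


def strip_first_comment(text: str) -> str:
    """Old behavior: remove one leading <!-- ... --> block."""
    return SINGLE.sub("", text, count=1)


def strip_leading_comments(text: str) -> str:
    """New behavior: remove all leading <!-- ... --> blocks in one call."""
    return MULTI.sub("", text, count=1)


sample = (
    "<!-- boilerplate header -->\n"
    "<!-- per-agent\nheader -->\n"
    "---\nname: demo\n---\n"
    "Body with <!-- an inline comment --> kept.\n"
)

print(strip_first_comment(sample).startswith("---"))     # False
print(strip_leading_comments(sample).startswith("---"))  # True
```

Without re.MULTILINE, ^ anchors only at the start of the string, so comments later in the body are left alone.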

@anilkmr-a2z anilkmr-a2z left a comment

Contributor

Mostly minor comments except one.

haofeif commented May 10, 2026

Contributor

@leih1219 can you please help to address the comments?

Lei Huang and others added 6 commits May 11, 2026 23:15
The ^-anchored regex introduced in 03e6d35 only matched the first leading
HTML comment block. Profiles with multiple consecutive <!-- ... --> blocks
(e.g. AIM-generated profiles with a boilerplate header + per-agent header)
would have only the first stripped, causing frontmatter detection to fail.

Replace the single-match pattern with a non-capturing repeat group
`^(?:<!--.*?-->\s*)+` so all leading comment blocks are stripped in a
single regex call.
Individual files inside a plugin's agents/ directory (e.g. symlinks
pointing outside the marketplace root) were enumerated without per-file
validation. Add _scan_plugin_directory() that wraps _scan_directory with
resolve()+is_relative_to() checks against the marketplace root for each
file entry.

Scope is narrow: only the claude_plugin discovery path uses the new
containment check. Other _scan_directory callers (local store, provider
dirs, extra dirs) are unchanged.

_discover_claude_plugin_agent_dirs() now returns (agents_dir, plugin_root)
tuples so callers can pass the root for containment validation.

Addresses reviewer comment about individual files escaping containment.
…invalidation

_discover_claude_plugin_agent_dirs() is called on every list_agent_profiles()
invocation from the long-running CAO server. Add a module-level cache keyed
on the mtime of settings.json and each marketplace.json so repeated calls
with unchanged filesystem state return instantly.

Cache invalidation is automatic: any mtime change on settings.json triggers
a full re-discovery, and mtime changes on individual marketplace.json files
are also detected.

Expose _reset_plugin_discovery_cache() (underscore-prefixed, test-only) so
tests can isolate from each other. An autouse fixture in the test file
ensures no cross-test cache pollution.

Addresses reviewer comment about repeated filesystem reads on every
invocation.
Tester-agent pass adding 7 new tests and 2 mock additions, filling
scenario-coverage gaps surfaced by the tester task for PR awslabs#231:

Fix C (HTML-comment strip):
- test_three_leading_html_comments_stripped: edge-case for 3+ blocks

Fix A (per-file plugin containment):
- test_read_agent_profile_source_rejects_symlink_escape / accepts_regular_plugin_file:
  cover the second call site (_read_agent_profile_source) end-to-end
- test_symlink_escape_in_non_plugin_dir_not_blocked:
  regression guard asserting the scope stays narrow (claude_plugin only)

Fix B (discovery cache):
- test_reset_plugin_discovery_cache_clears_state: explicit reset-helper test
- test_cache_invalidates_when_new_marketplace_added
- test_cache_invalidates_when_marketplace_json_disappears

Mocks (home-dir leakage in existing tests):
- test_builtin_store_exception_handled: mock _discover_claude_plugin_agent_dirs
- test_non_md_builtin_files_skipped: same

Coverage on agent_profiles.py: 85% -> 92%
Impacted test count: 73 -> 80 (with the 2 mock-added tests now stable)
Full suite: 1642 / 1642 non-infra tests pass.

patricka3125 commented May 14, 2026

Collaborator

Hi @leih1219, this is a really interesting PR, great work. I have some overall thoughts on the high-level direction and approach of this feature that I would like to request clarification on...

I think the motivation needs a sharper compatibility model before we add automatic discovery.

The PR assumes that “Claude Code plugin ships agents” implies “those agents should be selectable from CAO,” but I don’t think that follows yet. A Claude plugin agent is part of a broader Claude Code plugin runtime: plugins can ship agents alongside skills, hooks, MCP/LSP configs, monitors, bin/ executables, and default settings. CAO currently appears to consume only the agent markdown as a CAO agent profile, so an auto-discovered plugin agent may be listed without the surrounding plugin capabilities it was authored to rely on.

There is also a scoping question. Claude has distinct project, user, and plugin subagent scopes, with defined precedence. CAO has local/provider/custom/built-in profile sources and an orchestration-specific runtime model. Before scanning plugin marketplaces by default, I think we should define the relationship between:

  • CAO-native agent profiles
  • Claude user/project agents in .claude/agents and ~/.claude/agents
  • Claude plugin agents under plugin marketplaces

Those are not obviously interchangeable. Their frontmatter and semantics also differ: Claude subagents use fields like tools, disallowedTools, permissionMode, skills, mcpServers, hooks, isolation, etc., while CAO profiles have CAO/provider-specific concepts like provider, allowedTools, mcpServers, and orchestration assumptions. Some fields may be ignored, renamed, or meaningful only inside Claude Code.

Because of that, I’m not convinced automatic plugin-agent discovery is the right first step. It may make CAO list agents that look available but are degraded or incorrect when run outside the Claude plugin runtime. It also does not solve the stated expectation fully: if a plugin agent depends on plugin skills, hooks, scripts, or settings, simply pointing CAO at agents/ does not make the plugin usable in CAO.

I would prefer one of these narrower approaches:

  1. Treat Claude plugin agents as a separate source type with an explicit “experimental/Claude-only” compatibility contract.
  2. Add an explicit import/convert flow that maps Claude subagent fields into CAO profile fields and warns about unsupported plugin dependencies.
  3. Start with documentation/design for the intended relationship between CAO profiles and Claude agents before enabling automatic discovery.
  4. Avoid changing the default Claude agent directory until we know whether installing CAO profiles into ~/.claude/agents pollutes normal Claude Code workflows.

So I’m supportive of improving interop, but I think this PR currently conflates discoverability with compatibility. I’d like to see the compatibility/scoping model clarified before this becomes default behavior.

Labels

enhancement New feature or request

5 participants