Trustabl is a static analyzer for agent reliability. It parses an agent-SDK repository (Claude Agent SDK, OpenAI Agents SDK, Google ADK, MCP), models the tools, agents, subagents, skills, slash commands, and plugin manifests it declares, and checks them against a catalog of reliability and safety rules. It reports the weaknesses it finds — each with an explanation, a suggested fix, and a confidence score — as a human-readable summary, JSON, or SARIF 2.1.0, plus a per-surface reliability score and a CI-friendly exit code. It ships as a single Go binary; there is no daemon, server, or hosted service.
The rest of this document explains what Trustabl reasons about and how the scan works, then covers building and running it. For the full implementation reference see ARCHITECTURE.md; for the at-a-glance SDK coverage matrix see COVERAGE.md.
Trustabl does not treat a repository as one undifferentiated blob. Every rule is classified into exactly one of four scopes, and each scope receives a different typed input:
- `tool`: fires once per tool definition. Input: a `ToolDef` (a `@function_tool` / `@tool` / `@claude_tool` function, a Claude TS `tool(name, description, schema, handler)` factory call, a `FunctionTool(fn)` ADK wrapper, an `@server.tool` MCP registration, or a bare shell-invoking function) plus its parsed file. Catches a missing docstring, an HTTP call with no timeout, untyped parameters, or an unnormalized path flowing into `open()`. (Hosted tools like `WebSearchTool()` are agent-scope edge data, captured as `HostedToolDef`, not `ToolDef`.)
- `agent`: fires once per agent declaration. Input: an `AgentDef`, that is, a Python `Agent(...)` / `SandboxAgent(...)` / `AgentDefinition(...)` call, a Claude TS typed-const `AgentDefinition`, a Claude TS sub-agent inline in `options.agents`, or the Claude TS `query(...)` main-thread agent (`QueryMainAgent`), with every constructor kwarg captured and its edges to tools, handoffs, and guardrails resolved. Catches an agent with shell tools and no `input_guardrails`, `tool_use_behavior="stop_on_first_tool"` paired with filesystem-touching tools, or a main-thread agent with unrestricted `allowedTools`.
- `subagent`: fires once per Claude Code subagent markdown declaration. Discovery is hybrid: canonical `.claude/agents/*.md` (any path depth, monorepo-safe) plus a frontmatter-shape fallback over all markdown files (gated on `name` + `tools`/`model`) that catches flat-collection repos which ship subagents under `categories/*.md`, `plugins/<x>/agents/*.md`, or similar layouts. Input: a `SubagentDef` parsed from frontmatter: `name`, `description`, `tools[]` (verbatim) + `ToolGrants[]` (parsed permission grammar), `disallowedTools`, `model`, `permissionMode` (incl. `bypassPermissions`), `mcpServers`, `skills`, `isolation`, `hasHooks`. Catches a subagent granted the built-in `Bash` tool despite a read-only description (CSDK-110). Subagent presence alone contributes `claude_agent_sdk` to `SDKsDetected`, so the Claude pack loads and CSDK-110 fires on pure-markdown subagent collections.
- `repo`: fires once per scan against the whole inventory. Catches project-wide gaps such as the OpenAI Agents SDK being present with no custom trace processor configured.
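To make the four scopes concrete, here is a minimal Go sketch of how a rule's scope could determine how often it is evaluated. The names `Scope`, `Rule`, `Inventory`, and `fireCount` are hypothetical illustrations, not Trustabl's actual types, and the `EXAMPLE-*` rule IDs are invented.

```go
package main

import "fmt"

// Scope is a hypothetical stand-in for the four rule scopes described above.
type Scope string

const (
	ScopeTool     Scope = "tool"
	ScopeAgent    Scope = "agent"
	ScopeSubagent Scope = "subagent"
	ScopeRepo     Scope = "repo"
)

// Rule carries exactly one scope (illustrative shape only).
type Rule struct {
	ID    string
	Scope Scope
}

// Inventory holds counts of discovered surfaces (illustrative shape only).
type Inventory struct {
	Tools, Agents, Subagents int
}

// fireCount returns how many times a rule is evaluated: once per surface
// for tool, agent, and subagent scopes, and once per scan for repo scope.
func fireCount(r Rule, inv Inventory) int {
	switch r.Scope {
	case ScopeTool:
		return inv.Tools
	case ScopeAgent:
		return inv.Agents
	case ScopeSubagent:
		return inv.Subagents
	case ScopeRepo:
		return 1
	default:
		return 0 // unknown scope: never evaluated
	}
}

func main() {
	inv := Inventory{Tools: 2, Agents: 3, Subagents: 1}
	rules := []Rule{
		{ID: "EXAMPLE-TOOL", Scope: ScopeTool},
		{ID: "EXAMPLE-AGENT", Scope: ScopeAgent},
		{ID: "CSDK-110", Scope: ScopeSubagent},
		{ID: "EXAMPLE-REPO", Scope: ScopeRepo},
	}
	for _, r := range rules {
		fmt.Printf("%s evaluated %d time(s)\n", r.ID, fireCount(r, inv))
	}
}
```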
A repo can declare zero, one, or many agents, across one or more SDKs. Two agents in the same repo can be in completely different security postures — one wired with input/output guardrails, the other not. Agent-scoped findings therefore attribute to a specific agent at its constructor call site; flattening them to a single repo-level verdict would lose that attribution and be wrong. Discovery builds a small per-repo graph (tools, agents, subagents, and the edges between them) so agent-scope and subagent-scope rules can query it.
A Claude-SDK rule and an OpenAI-Agents-SDK rule that detect the same
conceptual problem (a missing timeout, say) are two separate rules with
SDK-specific explanation and fix text — there is no cross-SDK casting.
When a repo declares agents from multiple SDKs side by side, each agent is
checked only against the rules for the SDK that declared it. The same
holds across languages: a language: python rule will not fire on a
TypeScript agent.
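The isolation described above amounts to a filter on both SDK and language. The Go sketch below illustrates that idea; the `Rule` and `Agent` shapes and the `applicable` helper are assumptions made for this example, and `EXAMPLE-PY` is an invented rule ID.

```go
package main

import "fmt"

// Rule and Agent are hypothetical shapes; only the SDK and Language
// fields matter for this illustration.
type Rule struct {
	ID, SDK, Language string
}

type Agent struct {
	Name, SDK, Language string
}

// applicable returns the IDs of rules whose SDK and language both match
// the agent. Nothing is cast across SDKs or across languages.
func applicable(rules []Rule, a Agent) []string {
	var ids []string
	for _, r := range rules {
		if r.SDK == a.SDK && r.Language == a.Language {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

func main() {
	rules := []Rule{
		{ID: "EXAMPLE-PY", SDK: "claude_sdk", Language: "python"},
		{ID: "CSDK-120", SDK: "claude_sdk", Language: "typescript"},
		{ID: "OAI-105", SDK: "openai_sdk", Language: "typescript"},
	}
	agent := Agent{Name: "support-bot", SDK: "claude_sdk", Language: "typescript"}
	fmt.Println(applicable(rules, agent))
}
```

A Python rule for the same SDK and a TypeScript rule for a different SDK are both skipped for this agent.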
Trustabl scans in four steps. Each step's output is the typed input to the next, with no shared state between runs, and the inventory the early steps build is what makes policy selection data-driven rather than statically configured.
The binary ships with no embedded rules. Before the pipeline runs,
Trustabl resolves its detection rules from a separate git repository
(trustabl-rules) —
fetching the latest, caching the clone locally, and falling back to the
cache when the network is unreachable. This decouples rule updates from
binary releases: rules can be added or changed without rebuilding the
scanner. The resolved rules commit is recorded in the result and folded
into the ScanID, so a scan is honest about which rules produced it.
If no rules can be fetched and none are cached, the scan exits 2 and
tells you to run trustabl rules pull — Trustabl never runs rule-less.
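The fetch-then-fall-back behaviour can be sketched as follows. This is a simplified illustration, not the code in `internal/rulesource/`: `resolveRules`, its parameters, and the error text are hypothetical, and the real resolver also handles cloning and caching.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRules models the "nothing fetched, nothing cached" case, which the
// scanner reports with exit code 2.
var errNoRules = errors.New("no rules fetched and none cached; run `trustabl rules pull`")

// resolveRules prefers a fresh fetch, falls back to the cached commit when
// the fetch fails, and errors out when neither is available.
// fetch and cachedCommit are hypothetical stand-ins for the real sources.
func resolveRules(fetch func() (string, error), cachedCommit string) (string, error) {
	if commit, err := fetch(); err == nil {
		return commit, nil
	}
	if cachedCommit != "" {
		return cachedCommit, nil
	}
	return "", errNoRules
}

func main() {
	offline := func() (string, error) { return "", errors.New("network unreachable") }

	commit, err := resolveRules(offline, "abc123")
	fmt.Println("with cache:", commit, err)

	_, err = resolveRules(offline, "")
	fmt.Println("without cache:", err)
}
```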
```mermaid
flowchart LR
    target[("Agent repo<br/>(local path or GitHub URL)")]
    recon["Recon<br/>files · SDK deps"]
    inv["Inventory<br/>Python + TS AST:<br/>tools · agents ·<br/>subagents · MCP servers"]
    pol["Policy selection<br/>load rules per<br/>detected SDK ·<br/>META findings"]
    ana["Analysis<br/>tool · agent · subagent ·<br/>repo detectors"]
    score["Scoring<br/>per-surface score ·<br/>overall readiness"]
    out[("ScanResult<br/>findings · scores<br/>(human / JSON / SARIF)")]
    target --> recon --> inv --> pol --> ana --> score --> out
```
- Recon. Walk the repo and answer "what's in here" cheaply, without parsing any source language: languages present (by extension), SDK dependencies declared in manifests (`pyproject.toml` / `requirements.txt` / `Pipfile` / `poetry.lock` / `package.json` for the `claude-agent-sdk` / `@anthropic-ai/claude-agent-sdk` / `openai-agents` / `@openai/agents` / `google-adk` / `@google/adk` needles), the file inventory, and discovered agent components (MCP configs, hook scripts, `CLAUDE.md` and `AGENTS.md` guidance docs, `.claude/agents/*.md` subagents at any depth, `SKILL.md` skills, slash commands at both `.claude/commands/*.md` and `<plugin-root>/commands/*.md`, `.claude-plugin/{plugin,marketplace}.json` manifests, sandbox policies). No tree-sitter parses happen here; this step decides whether the expensive AST work is even worth attempting.
- Inventory. For each language Recon cleared, do the AST work and extract a typed inventory: `ToolDef`s with their config and body facts, `AgentDef`s with all kwargs captured, `SubagentDef`s / `SkillDef`s / `SlashCommandDef`s / `PluginManifest`s parsed from markdown and JSON frontmatter, `MCPServerDef`s, guardrails, sessions, and the resolved edges between agents and the tools/guardrails they reference. Detectors read fields off these structs; they never re-parse raw source.
- Policy selection. Load only the rule packs for SDKs actually observed in code. An SDK seen in code with no shipped pack emits a `META-001` info finding ("Trustabl does not currently audit this SDK"), because silence on an unknown SDK would be wrong. A dep declared but never used in code emits a different info finding flagging the drift.
- Analysis. Run the selected scope-aware detectors against the inventory. Findings carry the scope they fired at and attribute to the right location: tool file/line, agent call site, subagent markdown file, or the manifest.
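The policy-selection step can be pictured with a short Go sketch. Everything here is an assumption made for illustration: the `shipped` set, the `Plan` type, the `selectPolicy` helper, and the wording of the info messages. Only the `META-001` ID and the two behaviours (unaudited SDK, declared-but-unused dependency) come from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// shipped lists the SDKs assumed to have a rule pack in this sketch.
var shipped = map[string]bool{
	"claude_sdk": true,
	"openai_sdk": true,
	"google_adk": true,
}

// Plan is a hypothetical result of policy selection.
type Plan struct {
	Packs []string // rule packs to load
	Info  []string // informational findings
}

// selectPolicy loads packs only for SDKs observed in code, reports SDKs
// that have no pack, and flags dependencies that are declared but unused.
func selectPolicy(inCode, declared map[string]bool) Plan {
	var p Plan
	for sdk := range inCode {
		if shipped[sdk] {
			p.Packs = append(p.Packs, sdk)
		} else {
			p.Info = append(p.Info, "META-001: unaudited SDK "+sdk)
		}
	}
	for sdk := range declared {
		if !inCode[sdk] {
			p.Info = append(p.Info, "drift: "+sdk+" declared but not used in code")
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(p.Packs)
	sort.Strings(p.Info)
	return p
}

func main() {
	inCode := map[string]bool{"claude_sdk": true, "some_other_sdk": true}
	declared := map[string]bool{"claude_sdk": true, "openai_sdk": true}
	plan := selectPolicy(inCode, declared)
	fmt.Println("packs:", plan.Packs)
	for _, msg := range plan.Info {
		fmt.Println("info:", msg)
	}
}
```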
Three properties fall out of this staging, by design:
- Performance. A repo with no Python skips Python AST work; a repo with only Claude TS code skips Python AST work and OpenAI policy loading.
- Honest coverage. An "unaudited SDK" info finding is louder than a zero-findings clean bill of health on an SDK Trustabl doesn't know. A `META-004` finding further distinguishes "audited and clean" from "could not audit: discovery extracted nothing a rule targets."
- Determinism is a contract. Same inputs → same `ScanID`, and the report is byte-stable across runs (findings sorted by `(RuleID, FilePath, Line)`, inventory slices sorted deterministically). CI consumers can diff scans without spurious churn.
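The `(RuleID, FilePath, Line)` ordering is a plain lexicographic sort over three keys. A minimal Go sketch, assuming a `Finding` struct with just those three fields (the real struct carries more):

```go
package main

import (
	"fmt"
	"sort"
)

// Finding is reduced to the three sort keys named above.
type Finding struct {
	RuleID   string
	FilePath string
	Line     int
}

// sortFindings orders findings by RuleID, then FilePath, then Line, so the
// same set of findings always renders in the same order.
func sortFindings(fs []Finding) {
	sort.SliceStable(fs, func(i, j int) bool {
		a, b := fs[i], fs[j]
		if a.RuleID != b.RuleID {
			return a.RuleID < b.RuleID
		}
		if a.FilePath != b.FilePath {
			return a.FilePath < b.FilePath
		}
		return a.Line < b.Line
	})
}

func main() {
	fs := []Finding{
		{RuleID: "OAI-016", FilePath: "tools.ts", Line: 40},
		{RuleID: "CSDK-010", FilePath: "b.ts", Line: 7},
		{RuleID: "CSDK-010", FilePath: "a.ts", Line: 12},
		{RuleID: "CSDK-010", FilePath: "a.ts", Line: 3},
	}
	sortFindings(fs)
	for _, f := range fs {
		fmt.Printf("%s %s:%d\n", f.RuleID, f.FilePath, f.Line)
	}
}
```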
See ARCHITECTURE.md § 2 for the full diagram with typed inputs at each step.
Tool/agent AST discovery is wired for:
- Python: Claude Agent SDK (decorators), OpenAI Agents SDK, Google ADK. Discovery extracts tool definitions, agent constructors, hosted tools, MCP servers, guardrails, sessions.
- TypeScript: Claude Agent SDK (the `tool()` factory, the `query()` main-thread `QueryMainAgent`, inline-in-`query()` sub-agents, typed-const `AgentDefinition`s, `createSdkMcpServer` and the four `options.mcpServers` config literals), OpenAI Agents SDK (the `tool({...})` factory, `new Agent({...})` and `Agent.create({...})`, 9 hosted-tool factories, MCP server classes across 3 transports plus the `MCPServers` wrapper, 4 `defineX` guardrail factories, and the `MemorySession` / `OpenAIConversationsSession` / `OpenAIResponsesCompactionSession` session classes, gated on imports from `@openai/agents`, `@openai/agents-core`, or `@openai/agents-openai`), and Google ADK (the `new FunctionTool({...})` constructor; 5 agent constructors: `new LlmAgent({...})` / `SequentialAgent` / `ParallelAgent` / `LoopAgent` / `RoutedAgent`; 13 hosted-tool classes; and `subAgents` edges; gated on imports from `@google/adk`). Handles `.ts` / `.tsx` / `.mts` / `.cts` with both `tree-sitter-typescript` and `tree-sitter-tsx` grammars. TypeScript rule packs ship for all three SDKs: Claude Agent SDK (CSDK-010/011/012/013/014/016 tool rules; CSDK-120/130/131 agent rules), OpenAI Agents SDK (OAI-016/017/019/022/024 tool rules; OAI-105 agent rule), and Google ADK (ADK-013/015/016 tool rules; ADK-109 agent rule). A TS repo for any of the three no longer produces a blanket `META-004`; see `COVERAGE.md` for the full matrix.
JavaScript and Go files are recognized by Recon (they appear in the
file inventory and feed component discovery) but no AST parser for them
is wired in, so no tools or agents are extracted from them. The rule
schema's language: field is in place for when those parsers ship.
- LLM enrichment is opt-in. Rule-based detection runs fully without a key and makes no LLM network call without one. Use `trustabl llm key set` to configure a provider key (`~/.config/trustabl/keys.json`, mode 0600); `internal/inference/router.go` is the BYOK interface the enrichment path will call.
- Confidence scores are heuristic, not LLM-judged, and not yet calibrated against a labelled real-agent corpus; treat findings as signal to investigate.
- The CLI is the surface. No web app, API server, or GitHub Action: pipe `--format json` or `--format sarif` into your own automation.
Trustabl is a detect-and-report tool: it does not write or modify any
files in the scanned repo. Each run produces a ScanResult containing:
- Findings: one per rule hit, each with `severity`, `confidence`, an `explanation`, a `suggested_fix`, and the location it fired at (tool file/line, agent call site, subagent file, or the manifest).
- Per-surface readiness scores (one per discovered tool, agent, subagent, or the repo as a whole) and an overall score (a breadth-aware, badness-weighted mean: weak surfaces pull it down harder, but a single poor surface does not zero it; the score is a triage signal, not the CI gate).
- The discovered inventory: tools, agents, hosted tools, MCP servers, subagents, skills, slash commands, plugin manifests, and Claude settings, surfaced at the top level for CI consumers.
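The exact scoring formula is not given here (see `internal/analysis/scoring.go`). The Go sketch below only illustrates what a badness-weighted mean can look like. It assumes scores on a 0 to 100 scale and an invented weight of `1 + badness`, and it does not attempt the breadth-aware part.

```go
package main

import "fmt"

// overall computes a weighted mean in which each surface's weight grows
// with its "badness" (distance from a perfect 100).
// ASSUMPTION: the 0..100 scale and the weight 1 + (100-s)/100 are made up
// for illustration; Trustabl's real formula may differ.
func overall(scores []float64) float64 {
	if len(scores) == 0 {
		return 0 // arbitrary choice for the empty case in this sketch
	}
	var sum, weights float64
	for _, s := range scores {
		w := 1 + (100-s)/100
		sum += w * s
		weights += w
	}
	return sum / weights
}

func main() {
	scores := []float64{100, 100, 0}
	plain := (scores[0] + scores[1] + scores[2]) / 3
	fmt.Printf("plain mean: %.2f, badness-weighted: %.2f\n", plain, overall(scores))
}
```

With two perfect surfaces and one failing one, the weighted result sits below the plain mean but well above zero, which is the qualitative behaviour described above.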
The human format honestly separates the three things people commonly conflate:
```
Tool definitions: 2 (custom tools with function bodies — scored below)
Agent tool grants: 14 (tool names the agent may call — audited by agent-scope rules)
Hosted tools: 1 (...)
```
Only the "Tool definitions" category flows through tool-scope rules (they have function bodies a rule can read). Agent grants and hosted instances are inputs to agent-scope rules, not unanalyzed — they just don't appear in the per-surface readiness table.
--format human (default) renders a human summary to stdout and live
progress to stderr — an animated spinner and progress bar on an
interactive terminal, or plain [phase] summary lines when piped
(CI-friendly).
--format json marshals the full ScanResult for piping into your
own automation.
--format sarif emits a SARIF 2.1.0 document, suitable for
github/codeql-action/upload-sarif and other SARIF-aware tools.
--format json and --format sarif are progress-silent and byte-stable
across identical-input runs (pure functions of the ScanResult). The human
format is not byte-stable by design: its ANSI color is auto-detected from the
terminal (TTY vs pipe, NO_COLOR), so the same scan can render with or without
color. Use --no-color, or diff the JSON/SARIF output, when byte-stability
matters.
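As an example of consuming `--format json` in your own automation, the Go sketch below counts findings per severity. The JSON shape is an assumption: it presumes a top-level `findings` array whose items carry a string `severity` field, which should be checked against real output before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// scanResult models only the part of the output this sketch needs.
// ASSUMPTION: the key names "findings" and "severity" are guesses based on
// the field list above, not a documented schema.
type scanResult struct {
	Findings []struct {
		Severity string `json:"severity"`
	} `json:"findings"`
}

// countBySeverity decodes a JSON report and tallies findings by severity.
func countBySeverity(data []byte) (map[string]int, error) {
	var res scanResult
	if err := json.Unmarshal(data, &res); err != nil {
		return nil, err
	}
	counts := make(map[string]int)
	for _, f := range res.Findings {
		counts[f.Severity]++
	}
	return counts, nil
}

func main() {
	// In practice this would be the bytes produced by
	// `trustabl scan ./repo --format json`.
	sample := []byte(`{"findings":[{"severity":"high"},{"severity":"low"},{"severity":"high"}]}`)
	counts, err := countBySeverity(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("high:", counts["high"], "low:", counts["low"])
}
```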
Exit codes:
- `0`: no findings ≥ medium severity (or no findings at all).
- `1`: at least one finding ≥ medium severity, or `--strict` with any finding present.
- `2`: scanner / I/O error, or no usable rules found and none fetchable (run `trustabl rules pull`).
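The exit-code rules above can be written as one small decision function. This Go sketch is illustrative only: the `Severity` levels other than medium and info, and the `exitCode` signature, are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// Severity is ordered so that comparisons like s >= Medium work.
// ASSUMPTION: the exact set of levels is a guess for this sketch.
type Severity int

const (
	Info Severity = iota
	Low
	Medium
	High
)

// exitCode mirrors the three documented outcomes:
// 2 on scanner or rules errors, 1 on gating findings, 0 otherwise.
func exitCode(findings []Severity, strict bool, scanErr error) int {
	if scanErr != nil {
		return 2
	}
	if strict && len(findings) > 0 {
		return 1
	}
	for _, s := range findings {
		if s >= Medium {
			return 1
		}
	}
	return 0
}

func main() {
	fmt.Println(exitCode([]Severity{Info, Low}, false, nil))
	fmt.Println(exitCode([]Severity{Info, Low}, true, nil))
	fmt.Println(exitCode([]Severity{Medium}, false, nil))
	fmt.Println(exitCode(nil, false, errors.New("no usable rules")))
}
```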
OpenShell surfaces are still discovered (shell-invocation functions,
openshell/*.yaml policies) and reported on a Risk surfaces: openshell
block in the human format: the count of shell-invoking functions, the first
three file:line locations (deterministically sorted), a why: line stating
the threat model (a prompt-injected agent that exposes one of these as a
callable tool can run arbitrary commands), and a fix: line with concrete
remediations (sandbox, allowlist, drop shell=True, keep shell logic out
of agent-callable code). The OSH-* detection rules that audited these
surfaces have moved to a closed-source companion project; with no OSH rules
shipped, such repos fire no rule and no META finding — the block makes
the unaudited risk legible without claiming an audit happened. OpenShell is
a risk surface, not an SDK, so it is not flagged as "unaudited" the way an
unknown SDK would be.
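The count plus "first three locations" summary in that block can be sketched in Go as below. The `Location` type, the `summarize` helper, and the sort key (file path, then line) are assumptions; the text above only says the locations are deterministically sorted.

```go
package main

import (
	"fmt"
	"sort"
)

// Location is a hypothetical file:line pair for a shell-invoking function.
type Location struct {
	File string
	Line int
}

// summarize returns the total count and the first three locations after a
// deterministic sort. The input slice is copied, not modified.
func summarize(locs []Location) (int, []Location) {
	sorted := append([]Location(nil), locs...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].File != sorted[j].File {
			return sorted[i].File < sorted[j].File
		}
		return sorted[i].Line < sorted[j].Line
	})
	if len(sorted) > 3 {
		sorted = sorted[:3]
	}
	return len(locs), sorted
}

func main() {
	count, first := summarize([]Location{
		{File: "run.py", Line: 80},
		{File: "deploy.py", Line: 12},
		{File: "run.py", Line: 5},
		{File: "build.py", Line: 33},
	})
	fmt.Println("shell-invoking functions:", count)
	for _, l := range first {
		fmt.Printf("  %s:%d\n", l.File, l.Line)
	}
}
```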
```sh
# Homebrew
brew install trustabl/tap/trustabl

# Scoop
scoop bucket add trustabl https://github.com/trustabl/scoop-bucket
scoop install trustabl

# Docker
docker run --rm -v "$PWD:/repo" ghcr.io/trustabl/trustabl:latest scan /repo
```

Grab a prebuilt archive for your platform from the Releases page. Each release includes a `checksums.txt` and a build-provenance attestation; verify with:

```sh
gh attestation verify <archive> --repo trustabl/trustabl
```

Building from source requires `CGO_ENABLED=1` because the AST parsers use tree-sitter (Python + TypeScript + TSX bindings), which is a C library:

```sh
# macOS / Linux
CGO_ENABLED=1 go build -o trustabl ./cmd/trustabl

# Cross-compile: pick a C toolchain for the target. zig is the easiest.
CGO_ENABLED=1 CC="zig cc -target x86_64-linux-gnu" \
  GOOS=linux GOARCH=amd64 go build -o trustabl-linux ./cmd/trustabl
```

This is the cost of using tree-sitter for accurate AST parsing. If a single-binary, no-CGO distribution becomes a hard requirement later, the swap target is `github.com/go-python/gpython` for Python (with lower fidelity on modern Python); TypeScript would need a separate replacement.
```sh
# Local repo
trustabl scan ./path/to/agent-repo

# GitHub repo (shallow clone to temp dir, removed on exit)
trustabl scan https://github.com/org/repo

# Restrict detectors
trustabl scan ./repo --detectors claude_sdk
trustabl scan ./repo --detectors openai_sdk
trustabl scan ./repo --detectors google_adk
trustabl scan ./repo --detectors claude_sdk,openai_sdk,google_adk
# --detectors openshell is accepted but selects zero rules (pack is closed-source now)

# JSON output for CI piping
trustabl scan ./repo --format json

# SARIF output for GitHub Code Scanning / SARIF-aware tools
trustabl scan ./repo --format sarif > trustabl.sarif

# Exit 1 on any finding regardless of severity
trustabl scan ./repo --strict

# Download / refresh the detection rule packs into the local cache
trustabl rules pull

# Use a custom rules repo or a specific ref (env: TRUSTABL_RULES_REPO)
trustabl scan ./repo --rules-repo https://github.com/org/my-rules
trustabl scan ./repo --rules-ref v1.2.0

# Air-gapped / offline: skip the network fetch, use the cached rules only
trustabl scan ./repo --no-rules-update

# Progress output (human format): animated on a terminal, plain lines when piped
trustabl scan ./repo                # spinner + bars on a TTY; "[phase] summary" lines when piped
trustabl scan ./repo --no-progress  # disable progress entirely

# Configure LLM provider for enrichment (prerequisite for trustabl enrich)
trustabl llm list                         # show configured providers with masked keys
trustabl llm key set                      # prompt securely for an API key
trustabl llm key set sk-ant-api03-...     # set key non-interactively
trustabl llm key get                      # show masked key for active provider
trustabl llm key delete                   # delete key with confirmation prompt
trustabl llm model set claude-sonnet-4-6  # change model for active provider
```

Rules are cached under your OS cache dir (`os.UserCacheDir()`, e.g. `%LocalAppData%\trustabl\rules\` on Windows, `~/.cache/trustabl/rules/` on Linux). The first scan (or an explicit `trustabl rules pull`) populates it; each subsequent scan checks for an update first (unless `--no-rules-update`), falling back to the cached rules if the fetch fails.
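To see where the cache lands on a given machine, the Go sketch below derives the directory from `os.UserCacheDir()` the same way the paths above suggest. The helper names are hypothetical, and the `trustabl/rules` suffix is taken from the example paths rather than from the source.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rulesDirUnder appends the trustabl/rules suffix to a base cache dir.
func rulesDirUnder(base string) string {
	return filepath.Join(base, "trustabl", "rules")
}

// rulesCacheDir resolves the per-user cache location for rule packs.
// os.UserCacheDir can fail, for example when neither $XDG_CACHE_HOME nor
// $HOME is set on Linux.
func rulesCacheDir() (string, error) {
	base, err := os.UserCacheDir()
	if err != nil {
		return "", err
	}
	return rulesDirUnder(base), nil
}

func main() {
	dir, err := rulesCacheDir()
	if err != nil {
		fmt.Println("cannot determine cache dir:", err)
		return
	}
	fmt.Println(dir)
}
```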
This means the resolved rule pack targets a newer rule-schema version than
your Trustabl binary understands (the engine gates packs on
schema_version; see internal/rules/schema_version.go). The fallback to
cached rules can only help when a compatible pack is already cached — if
the only cached pack is the too-new one, the scan exits 2.
Two fixes:

- Upgrade Trustabl to a build that supports the newer schema (the usual fix: the binary is simply behind the rules repo).
- Pin an older rules branch or tag whose pack targets a schema your build supports:

  ```sh
  trustabl scan ./repo --rules-ref <branch-or-tag>
  ```

  `--rules-ref` resolves branches and tags only, not raw commit SHAs, so a compatible ref must already exist in the rules repo. If every branch is on the newer schema, tag a known-good older commit there first and pin that tag:

  ```sh
  # in the rules repo, at the newest commit whose manifest.yaml schema_version
  # is <= what your build supports (git log -p -- manifest.yaml shows each bump)
  git tag schema-8 <sha> && git push origin schema-8
  ```

`--no-rules-update` does not help here: it is cache-only, so if your cache already holds the too-new pack the scan still fails.
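The schema gate and its interaction with the cache can be summarised in a few lines of Go. This is a sketch, not the logic in `internal/rules/schema_version.go`: the `Pack` type, the `choosePack` helper, and the assumption that `schema_version` is an integer are all illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// Pack is a hypothetical view of a resolved rule pack.
type Pack struct {
	Ref           string
	SchemaVersion int // ASSUMPTION: schema_version is a plain integer
}

var errSchemaTooNew = errors.New("no rule pack with a supported schema_version")

// choosePack prefers the freshly fetched pack, falls back to the cached one,
// and accepts a pack only if its schema is not newer than the binary supports.
// Either pointer may be nil when that source is unavailable.
func choosePack(fetched, cached *Pack, supported int) (Pack, error) {
	for _, p := range []*Pack{fetched, cached} {
		if p != nil && p.SchemaVersion <= supported {
			return *p, nil
		}
	}
	return Pack{}, errSchemaTooNew
}

func main() {
	tooNew := &Pack{Ref: "main", SchemaVersion: 9}
	older := &Pack{Ref: "schema-8", SchemaVersion: 8}

	p, err := choosePack(tooNew, older, 8)
	fmt.Println(p.Ref, err) // a compatible cached pack rescues the scan

	_, err = choosePack(tooNew, tooNew, 8)
	fmt.Println(err) // only too-new packs available: the scan would exit 2
}
```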
| Pipeline node | Code path |
|---|---|
| Importer | internal/ingestion/importer.go |
| Normalizer (recon) | internal/ingestion/normalizer.go |
| Discovery (Python AST + markdown/JSON) | internal/analysis/discovery.go, agents.go, hosted_tools.go, mcp_servers.go, adk_agents.go (Python AST); subagents.go, markdown_agents.go, skills.go, slash_commands.go (markdown frontmatter); plugins.go, claude_settings.go (JSON) |
| TypeScript discovery | internal/analysis/ts_discovery.go, ts_agents.go, ts_mcp_servers.go, ts_handler_facts.go, ts_openai_tools.go, ts_openai_agents.go, ts_openai_hosted_tools.go, ts_openai_mcp_servers.go, ts_openai_guardrails.go, ts_openai_sessions.go, ts_adk_tools.go, ts_adk_agents.go, ts_adk_hosted_tools.go, astutil/ts.go |
| Detector runtime | internal/analysis/detectors/ |
| Rule source | internal/rulesource/ (git fetch + cache + schema-version gate) |
| Detector rules | external trustabl-rules repo (tests: testdata/rules-fixture/) |
| Rule engine | internal/rules/{schema,loader,evaluator,predicates,rule_detector}.go |
| Scoring engine | internal/analysis/scoring.go |
| Report renderer | internal/review/diff.go (human), internal/sarif/render.go (SARIF), JSON marshal in cmd/trustabl |
| Inference router | internal/inference/router.go |
| LLM config | internal/llm/ (key storage · masking · validation) |
Rule packs live in the separate trustabl-rules git repository (grouped
{claude_sdk,openai_sdk,google_adk}/), resolved at scan time rather
than embedded in the binary. Naming convention: CSDK-NNN for Claude
Agent SDK rules (CSDK-0xx tool-scope, CSDK-1xx agent + subagent-scope),
OAI-NNN for OpenAI Agents SDK rules, ADK-NNN for Google ADK rules.
See
ARCHITECTURE.md § 2 — steps 3–4 for the
shipped rule table and COVERAGE.md for per-SDK
recognition detail.
testdata/corpus/ holds real-world agent code (Claude SDK demos, OpenAI Agents
SDK demos, Google ADK demos, a TS Claude SDK fixture) — a corpus, not a
controlled fixture, so well-written agents won't trigger most rules and
that's correct. See testdata/corpus/PROVENANCE.md
for upstream sources and licenses of each example. Per-rule fire/silent
correctness lives in internal/rules/policies_test.go; the end-to-end
sweep in internal/scanner/scanner_test.go only asserts the scanner
doesn't crash on real-world inputs. A labelled 20–40 real-agent-repo
corpus is the detection-quality target (see
ARCHITECTURE.md § 10);
the current tests are regression coverage, not detection-quality
measurement.
Join the Trustabl Discord to ask questions, share feedback, and follow development.
Apache-2.0. See LICENSE.
