justindobbs/awesome-certified-agents

Awesome Certified Agents

TraceCore spec-v1.0 · License: MIT · PRs Welcome

A community catalog of autonomous agents and agent bundles certified by passing TraceCore deterministic episode runs gated by GitHub Actions.

This repository currently targets TraceCore 1.1.2 as the public certification baseline.


What is this?

awesome-certified-agents is an "awesome list" with a twist: every entry is a verified case study, not just a link. To appear in this catalog, an agent must pass tracecore run --strict-spec in public CI — producing an immutable, schema-validated artifact as evidence.

GitHub Actions is the public gate. Human approval alone cannot certify an agent; the workflow must pass first.

Public submissions must work with tracecore>=1.1.2. CI certifies each entry against the pinned launch baseline, tracecore==1.1.2, so that reviews are deterministic.

Certification tiers

| Badge | Tier | Requirement |
|---|---|---|
| Certified | Certified | ≥1 frozen TraceCore task passing --strict-spec |
| Certified+ | Certified+ | ≥3 frozen tasks across ≥2 task suites |
| Provisional | Provisional | Metadata merged; CI not yet passing (grace period) |
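
The tier thresholds above can be read as a small decision rule. The sketch below is illustrative only: `tier_for` and its input shape are hypothetical names, not part of TraceCore or of scripts/validate_metadata.py, and it assumes each passing run is recorded as a (task, suite) pair.

```python
def tier_for(passing: list[tuple[str, str]]) -> str:
    """Return the tier implied by a list of (task_ref, suite) pairs that passed --strict-spec.

    Hypothetical helper: the repository's own validator may apply further checks.
    """
    tasks = {task for task, _ in passing}
    suites = {suite for _, suite in passing}
    # Certified+ needs at least 3 frozen tasks spread over at least 2 suites.
    if len(tasks) >= 3 and len(suites) >= 2:
        return "Certified+"
    # Certified needs at least 1 frozen task passing.
    if len(tasks) >= 1:
        return "Certified"
    # Nothing passing yet: metadata may be merged, but CI has not certified the entry.
    return "Provisional"
```

For example, three passing tasks that all belong to the api suite would still map to Certified, because the second suite is missing.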

Certified Agents & Bundles

Featured examples for the public launch:

  • Starter single-agent example: test-agent
  • Reference agent example: toy-agent
  • Reference Certified+ example: tri-suite-agent
  • Reference API example: rate-limit-agent
  • Reference operations example: ops-triage-agent
  • Reference bundle example: filesystem-duo-bundle

The catalog now includes certified examples in the filesystem, API, and operations suites, plus a multi-suite Certified+ entry.

| Agent / Bundle | Description | Framework | Suites | Tasks | Maintainer | Certified |
|---|---|---|---|---|---|---|
| tri-suite-agent | Multi-task agent that covers filesystem, API, and operations tasks with a single deterministic task-aware policy. | pure-python | filesystem, api, operations | 3 | @justindobbs | Certified+ |
| filesystem-duo-bundle | Two-stage filesystem bundle where a scout agent locates candidate files and an extractor agent commits the recovered API_KEY. | pure-python | filesystem | 1 | @justindobbs | Certified |
| ops-triage-agent | Deterministic operations agent that triages incident artifacts and recovers ALERT_CODE. | pure-python | operations | 1 | @justindobbs | Certified |
| rate-limit-agent | Deterministic API agent that retrieves ACCESS_TOKEN while honoring rate limits and transient failures. | pure-python | api | 1 | @justindobbs | Certified |
| test-agent | Minimal filesystem explorer that proves the certification workflow is wired correctly. | pure-python | filesystem | 1 | @justindobbs | Certified |
| toy-agent | Cautious filesystem explorer that extracts API_KEY via list_dir → read_file → extract_value → set_output with retry logic. | pure-python | filesystem | 1 | @justindobbs | Certified |
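
To make the toy-agent description concrete, here is a minimal sketch of a list_dir → read_file → extract_value → set_output flow with retries. Everything in it is an assumption for illustration: the in-memory `DIRS` and `FILES` data, the helper signatures, and `run_policy` are hypothetical stand-ins, not the TraceCore tool API and not the code in the toy-agent entry.

```python
import re
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Hypothetical in-memory filesystem standing in for the task sandbox.
DIRS = {"/app": ["README.md", "config"], "/app/config": [".env"]}
FILES = {
    "/app/README.md": "nothing to see here",
    "/app/config/.env": "DEBUG=0\nAPI_KEY=abc123\n",
}

def list_dir(path: str) -> list[str]:
    return DIRS.get(path, [])

def read_file(path: str) -> str:
    return FILES[path]

def extract_value(text: str, key: str) -> Optional[str]:
    """Return the value of a KEY=value line, or None if the key is absent."""
    match = re.search(rf"^{re.escape(key)}=(.+)$", text, re.MULTILINE)
    return match.group(1).strip() if match else None

def with_retry(fn: Callable[[], T], attempts: int = 3) -> T:
    """Call fn, retrying on OSError; re-raise once the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except OSError:
            if attempt == attempts:
                raise
    raise ValueError("attempts must be at least 1")

def run_policy(root: str = "/app", key: str = "API_KEY") -> dict[str, str]:
    output: dict[str, str] = {}

    def set_output(name: str, value: str) -> None:
        output[name] = value

    pending = [root]  # directories still to explore (depth-first)
    while pending:
        current = pending.pop()
        for name in with_retry(lambda: list_dir(current)):
            path = f"{current}/{name}"
            if path in DIRS:
                pending.append(path)
                continue
            value = extract_value(with_retry(lambda: read_file(path)), key)
            if value is not None:
                set_output(key, value)
                return output
    return output
```

Because the traversal order and the retry count are fixed, the same inputs always produce the same sequence of tool calls, which is the property a deterministic episode run depends on.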

How to get certified

Use a virtual environment (recommended)

Following the FastAPI guidance on creating virtual environments, isolate your TraceCore install before running any commands:

python -m venv .venv            # Windows: use "py -3.12 -m venv .venv" if multiple Python versions
# Windows activation
.venv\Scripts\activate
# macOS / Linux activation
source .venv/bin/activate

  1. Install TraceCore: pip install "tracecore>=1.1.2"
  2. Run your agent: tracecore run --agent my_agent.py --task <task_ref> --seed <seed> --strict-spec
  3. Ingest the latest matching artifact: python scripts/ingest_run_artifact.py agents/<my-agent> --latest --agent agents/<my-agent>/agent.py --task <task_ref>
  4. Create the entry files under agents/<my-agent>/, including agent.py, README.md, runs/<run_id>.json, and metadata.yaml that matches the committed artifact
  5. Validate locally: python scripts/validate_metadata.py agents/<my-agent>/metadata.yaml --show-tier
  6. Open a PR — CI re-runs your agent live and a CODEOWNER reviews the artifacts
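
Steps 2, 3, and 5 can be chained from one script. The sketch below only assembles the command lines, using the flags shown in the steps above. The function name `certification_commands` and the agents/<my-agent>/agent.py layout for the `--agent` argument are assumptions for illustration.

```python
def certification_commands(agent_name: str, task_ref: str, seed: int) -> list[list[str]]:
    """Build the run, ingest, and validate commands for one agent entry.

    Hypothetical helper: it assumes the agent lives at agents/<agent_name>/agent.py.
    """
    agent_dir = f"agents/{agent_name}"
    agent_py = f"{agent_dir}/agent.py"
    return [
        # Step 2: run the agent under --strict-spec with a fixed seed.
        ["tracecore", "run", "--agent", agent_py, "--task", task_ref,
         "--seed", str(seed), "--strict-spec"],
        # Step 3: ingest the latest matching artifact into the entry directory.
        ["python", "scripts/ingest_run_artifact.py", agent_dir, "--latest",
         "--agent", agent_py, "--task", task_ref],
        # Step 5: validate the metadata and show the resulting tier.
        ["python", "scripts/validate_metadata.py", f"{agent_dir}/metadata.yaml",
         "--show-tier"],
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that a failing step stops the sequence before a PR is opened.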

Full guide: docs/how_to_submit.md

Before opening a PR, make sure your entry is understandable to an external reviewer with no prior context beyond the submitted files and the TraceCore docs.


Available frozen tasks

| Task | Suite | What the agent must do |
|---|---|---|
| filesystem_hidden_config@1 | filesystem | Discover API_KEY in a constrained filesystem |
| rate_limited_api@1 | api | Fetch ACCESS_TOKEN under rate limits and transient errors |
| rate_limited_chain@1 | api | Navigate a multi-step handshake + rate limit |
| deterministic_rate_service@1 | api | Parse payload templates + rate-limited service |
| log_alert_triage@1 | operations | Triage noisy logs, recover ALERT_CODE |
| config_drift_remediation@1 | operations | Detect config drift, emit remediation patch |
| incident_recovery_chain@1 | operations | Follow a recovery handoff chain, emit RECOVERY_TOKEN |
| log_stream_monitor@1 | operations | Poll paginated logs, detect CRITICAL, emit STREAM_CODE |
| runbook_verifier@1 | operations | Verify runbook phase order, emit RUNBOOK_CHECKSUM |
| sandboxed_code_auditor@1 | operations | Audit a sandbox runtime, emit ISSUE_ID |

All tasks are deterministic given a fixed seed. See TraceCore SPEC_FREEZE.
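
Task references in the list above follow a name@version pattern. As a rough sketch, a submission script could validate a reference and look up its suite before starting a run. `parse_task_ref`, `suite_of`, and the `FROZEN_TASKS` mapping are hypothetical helpers built from the list above, not TraceCore APIs.

```python
from typing import NamedTuple

class TaskRef(NamedTuple):
    name: str
    version: int

# Frozen task -> suite, copied from the list above.
FROZEN_TASKS = {
    "filesystem_hidden_config@1": "filesystem",
    "rate_limited_api@1": "api",
    "rate_limited_chain@1": "api",
    "deterministic_rate_service@1": "api",
    "log_alert_triage@1": "operations",
    "config_drift_remediation@1": "operations",
    "incident_recovery_chain@1": "operations",
    "log_stream_monitor@1": "operations",
    "runbook_verifier@1": "operations",
    "sandboxed_code_auditor@1": "operations",
}

def parse_task_ref(ref: str) -> TaskRef:
    """Split 'name@version' into its parts, rejecting malformed references."""
    name, sep, version = ref.partition("@")
    if not sep or not name or not version.isdigit():
        raise ValueError(f"expected '<name>@<version>', got {ref!r}")
    return TaskRef(name, int(version))

def suite_of(ref: str) -> str:
    """Return the suite of a frozen task, or raise KeyError if it is not listed."""
    parse_task_ref(ref)  # validate the shape first
    return FROZEN_TASKS[ref]
```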


What --strict-spec verifies

Every certified run artifact includes:

| Field | Purpose |
|---|---|
| spec_version | Declares tracecore-spec-v1.0 compliance |
| runtime_identity | Name, version, git SHA of the runtime |
| task_hash | SHA-256 of the task harness files |
| agent_ref | Agent module path invoked |
| artifact_hash | Stable SHA-256 of the serialized artifact |
| budgets | Frozen max steps/tool-calls for the episode |
| wall_clock_elapsed_s | Total episode wall time in seconds |

Artifacts conform to artifact-schema-v1.0.json and are immutable once committed.
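
A quick local sanity check for these fields might look like the sketch below. It is not the schema validator: `missing_fields` only tests for key presence, and `stable_digest` shows one common way to get a stable hash (canonical JSON with sorted keys, excluding the hash field itself). How TraceCore actually serializes and hashes artifacts is not stated here, so treat that part as an assumption.

```python
import hashlib
import json

# Field names taken from the list above.
REQUIRED_FIELDS = (
    "spec_version",
    "runtime_identity",
    "task_hash",
    "agent_ref",
    "artifact_hash",
    "budgets",
    "wall_clock_elapsed_s",
)

def missing_fields(artifact: dict) -> list[str]:
    """Return the required top-level fields that the artifact lacks."""
    return [field for field in REQUIRED_FIELDS if field not in artifact]

def stable_digest(artifact: dict) -> str:
    """SHA-256 over a canonical JSON form of the artifact.

    Assumption: the artifact_hash field is excluded and keys are sorted,
    so the digest does not depend on key order. TraceCore may differ.
    """
    body = {k: v for k, v in artifact.items() if k != "artifact_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Loading a committed runs/<run_id>.json with `json.load` and calling `missing_fields` on it gives a fast failure before CI does the full schema validation.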


Use a badge in your own repo

[![Awesome Certified](https://img.shields.io/badge/Awesome%20Certified-brightgreen?style=flat-square&logo=checkmarx)](https://github.com/awesome-certified-agents/awesome-certified-agents)

License

MIT — see LICENSE.

Powered by TraceCore — deterministic episode runtime for autonomous agents.
