Remote Dev Bot

An AI-powered development workflow where GitHub issues get resolved autonomously via pull requests — like having a remote colleague who checks GitHub between surf sessions. Using shell-based agents feels like pair programming, which is a valuable mode of collaboration, but sometimes you want something that feels more like delegating work to an experienced coworker. This system aims to provide that alternative.

To install, say to your AI agent:

claude "Read https://raw.githubusercontent.com/gnovak/remote-dev-bot/main/install.md and follow it to set up remote-dev-bot for my repo {owner}/{repo}. This is my first time — walk me through each step, explain what's happening, and ask before doing anything."

There are already excellent vendor-specific implementations of this pattern (GitHub Copilot Workspace, Cursor, etc.), so this project isn't necessarily better than those. However, it's intentionally cross-platform and was built as a learning exercise — a way to understand the agent tooling space and explore how to design agents that can autonomously handle real development tasks.

How It Works

  1. Create a GitHub issue describing a feature or bug
  2. Comment /agent-resolve (or /agent-resolve-claude-large, etc.) to trigger implementation
  3. A GitHub Action spins up a custom LiteLLM agent loop that:
    • Reads the issue and codebase
    • Implements the requested changes
    • Opens a draft PR
  4. Review the PR. If changes are needed, comment /agent-resolve on the PR with feedback for another pass.
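
The agent loop in step 3 lives in lib/resolve.py and isn't reproduced here; as a rough illustration of the pattern (all names are hypothetical, and `complete` stands in for a wrapper around a litellm.completion call), a tool-calling loop looks like:

```python
import json
import subprocess

def run_bash(command: str) -> str:
    """Example tool: run a shell command and return its combined output."""
    proc = subprocess.run(command, shell=True, capture_output=True,
                          text=True, timeout=60)
    return proc.stdout + proc.stderr

TOOLS = {"bash": run_bash}

def agent_loop(complete, task: str, max_iterations: int = 50) -> str:
    """Run the model until it stops requesting tools or the budget runs out.

    `complete(messages)` is a thin wrapper around the LLM call (e.g.
    litellm.completion) that returns the assistant message as a dict.
    """
    messages = [{"role": "user", "content": task}]
    for _ in range(max_iterations):
        message = complete(messages)
        messages.append(message)
        tool_calls = message.get("tool_calls")
        if not tool_calls:
            # No tool requests: the model considers itself done.
            return message.get("content", "")
        for call in tool_calls:
            fn = call["function"]
            result = TOOLS[fn["name"]](**json.loads(fn["arguments"]))
            messages.append({"role": "tool",
                             "tool_call_id": call["id"],
                             "content": result})
    return "Stopped: iteration limit reached."
```

The real loop adds the pieces described later in this README: output truncation, context pruning, a watchdog timeout, and a final self-evaluation.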

Or use /agent-design to explore the codebase and get a design analysis posted as a comment (no code changes).

Or use /agent-review on a pull request to get an AI code review posted as a comment (no code changes).


Commands

| Command | What it does |
| --- | --- |
| /agent-resolve | Resolve the issue and open a PR (default model) |
| /agent-resolve-claude-large | Resolve with a specific model |
| /agent-design | Explore codebase and post a design analysis comment |
| /agent-design-claude-large | Design analysis with a specific model |
| /agent-review | Post a code review comment on a PR (no code changes) |
| /agent-review-claude-large | Code review with a specific model |
| /agent-workshop[-<model>] | Design analysis + multi-model council critique |
| /agent-build[-<model>] | Implement issue + multi-model council code review |

Modes and model aliases are configured in remote-dev-bot.yaml.

Mobile-friendly syntax: Commands are case-insensitive and you can use spaces instead of dashes: /agent resolve claude large works the same as /agent-resolve-claude-large.

Per-Invocation Arguments

You can override settings for a single run by adding arguments on lines after the command:

/agent resolve
max iterations = 75
branch = feature/my-branch
extra_files = extra-context.md

| Argument | Type | Description |
| --- | --- | --- |
| max iterations | integer | Override the iteration limit for this run |
| timeout minutes | integer | Override the watchdog timeout in minutes for this run |
| branch | string | Target branch for the PR (default: main) |
| extra files | list | Additional context files for the agent to read (space-separated) |
| bash output limit | integer | Max bash output chars kept (first half + last half, middle dropped; default: 8000) |

Argument names are flexible: max iterations, max-iterations, and max_iterations all work.
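
A sketch of how such a comment body might be parsed, with argument names normalized so all three spellings are equivalent (hypothetical, for illustration only):

```python
def parse_arguments(comment_body: str) -> dict:
    """Parse 'key = value' lines that follow the command line, canonicalizing
    names: 'max iterations', 'max-iterations', and 'max_iterations' all
    become 'max_iterations'."""
    args = {}
    for line in comment_body.splitlines()[1:]:  # skip the command itself
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        canonical = key.strip().lower().replace("-", "_").replace(" ", "_")
        args[canonical] = value.strip()
    return args
```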

For tuning and observability options (status_log_interval, context_keep_tool_results, etc.) see debug.md.

Understanding Model Names

Model aliases (like claude-small) map to LiteLLM model identifiers in remote-dev-bot.yaml. Remote Dev Bot uses LiteLLM to talk to different LLM providers through a unified interface.

Model ID format: provider/model-name

| Provider | Prefix | Example |
| --- | --- | --- |
| Anthropic (Claude) | anthropic/ | anthropic/claude-sonnet-4-5 |
| OpenAI (GPT) | openai/ | openai/gpt-5.1-codex-mini |
| Google (Gemini) | gemini/ | gemini/gemini-2.5-flash |

Supported Providers

Remote Dev Bot currently supports three LLM providers out of the box:

| Provider | Secret Name | Model Prefix |
| --- | --- | --- |
| Anthropic (Claude) | ANTHROPIC_API_KEY | anthropic/ |
| OpenAI (GPT) | OPENAI_API_KEY | openai/ |
| Google (Gemini) | GEMINI_API_KEY | gemini/ |

The workflow automatically selects the correct API key based on the model prefix. For example, a model ID starting with anthropic/ will use ANTHROPIC_API_KEY.
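
The selection rule amounts to a prefix lookup. The actual workflow does this in a shell step; in Python the same logic looks like:

```python
# Map each model-ID prefix to the secret/environment variable holding its key.
PROVIDER_KEYS = {
    "anthropic/": "ANTHROPIC_API_KEY",
    "openai/": "OPENAI_API_KEY",
    "gemini/": "GEMINI_API_KEY",
}

def api_key_name(model_id: str) -> str:
    """Return the name of the secret to use for a LiteLLM model ID."""
    for prefix, key_name in PROVIDER_KEYS.items():
        if model_id.startswith(prefix):
            return key_name
    raise ValueError(f"No API key configured for model {model_id!r}")
```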

Adding a new provider: LiteLLM supports many providers beyond these three. To add support for a new provider, you'll need to modify .github/workflows/remote-dev-bot.yml:

  1. Add the new secret to the workflow_call.secrets section
  2. Add a case in the "Determine API key" step to match the provider prefix
  3. Pass the secret to the environment in the resolve and design jobs

See the LiteLLM providers documentation for the full list of supported providers and their model prefixes.

Finding Valid Model Names

The model strings must be valid LiteLLM identifiers. Browse available models at models.litellm.ai — search by name, filter by provider, and see context windows and pricing.

Prefix the model string with the provider name in remote-dev-bot.yaml (e.g., anthropic/claude-sonnet-4-5).

Choosing a Model

For most tasks: Use the default (/agent-resolve). Claude Sonnet (claude-small) offers a good balance of capability and cost.

For complex multi-file features: Use /agent-resolve-claude-large (Opus) or /agent-resolve-gpt-large (GPT Codex). These models handle larger contexts and more intricate reasoning.

For coding-heavy tasks: Models with "codex" in the name (e.g., openai/gpt-5.1-codex-mini) are specifically tuned for code generation and may perform better on implementation tasks.

Commit Trailers

To have the agent sign its commits (e.g. with the model name), add an instruction to your AGENTS.md:

Sign all commits with a trailer: Model: <your model name and version>

There is no built-in trailer — commit message format is entirely up to you.

Workshop and Build Modes

Workshop and Build are council modes: after the agent finishes its primary task, each model in the configured council independently reviews the result and posts a structured comment. A human reads the council's feedback and decides what to do next.

Workshop (/agent-workshop):

  • Stage 1: one agent explores the codebase and produces a design proposal (same as /agent-design)
  • Stage 2: each council model independently critiques the proposal and posts a structured review comment on the issue
  • Bot pauses — human reads the critiques and replies, then can trigger /agent-design for a revised proposal or /agent-build to implement

Build (/agent-build):

  • Stage 1: one agent implements the issue and opens a PR (same as /agent-resolve)
  • Stage 2: each council model reviews the PR diff and posts a code review comment on the PR
  • Bot pauses — human reviews the code reviews and decides whether to merge

Both modes read an optional council: list from the mode config. If omitted, the council defaults to all configured models.
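
The defaulting rule can be sketched like this (a hypothetical helper, assuming the config has already been loaded from remote-dev-bot.yaml into a dict):

```python
def resolve_council(config: dict, mode: str) -> list[str]:
    """Return the council for a mode; fall back to every configured
    model alias when the mode entry omits 'council'."""
    mode_cfg = config.get("modes", {}).get(mode, {})
    return mode_cfg.get("council") or list(config.get("models", {}))
```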

Architecture

The system has two parts:

  • Shim workflow (.github/workflows/agent.yml) — a thin trigger that lives in each target repo. Fires on /agent- commands and calls the reusable workflow. Copying this file into a target repo installs the shim.
  • Reusable workflow (.github/workflows/remote-dev-bot.yml) — all the logic: parses commands, dispatches to resolve, design, or review mode, runs the agent. Lives in this repo and is called by shims in target repos.
  • LiteLLM agent loop (lib/resolve.py) — the custom agent that does the actual code exploration and editing
  • remote-dev-bot.yaml — model aliases and agent settings (max iterations, PR type)
  • install.md — step-by-step setup instructions, designed to be followed by a human or by an AI assistant (like Claude Code)

Setup

See install.md for complete setup instructions. It's designed so you (or an AI assistant) can follow it step-by-step to get this running in your own GitHub account.

Quick version: You need a GitHub repo, API keys for your preferred LLM provider(s), and about 10 minutes. No PAT or special authentication is required — the bot works with GitHub's built-in token and posts as github-actions[bot].

Advanced auth options: If you want bot PRs to auto-trigger CI, or a custom bot identity (e.g., your-app[bot]), see the advanced auth section in install.md. Options include a GitHub App (recommended) or a PAT.

Customization

Add Repo Context for the Agent

Create an AGENTS.md or CLAUDE.md in your target repo with anything the agent should know about your codebase: coding conventions, architecture overview, how to run tests, directories to avoid, etc. Add the file to extra_files in your remote-dev-bot.yaml so the agent reads it before starting work.

An AI assistant can write this for you — just ask it to read your codebase and generate an AGENTS.md describing the architecture and conventions.

It's also worth noting project maturity and constraints: whether backward compatibility matters, whether there are external users, whether data can be regenerated from scratch. Agents use this to calibrate how much weight to put on migration paths, API stability, and similar concerns. For example: "This is a pre-alpha prototype with no external users; backward compatibility and data migration are non-concerns."

Model Aliases

Add or modify model aliases in your repo's remote-dev-bot.yaml (create it in the repo root if it doesn't exist):

models:
  my-alias:
    id: anthropic/claude-sonnet-4-5
    description: "My custom model"

These settings layer on top of the base config in remote-dev-bot.yaml from the remote-dev-bot repo. Your repo's settings take precedence. See how-it-works.md for config layering details.

Iteration Limits

The agent runs for up to 50 iterations by default. Lower this for simpler repos (less cost, faster results) or raise it for complex tasks:

agent:
  max_iterations: 30

When the Agent Can't Fully Resolve an Issue

Sometimes the agent judges that it couldn't completely fix the issue (it reports success=False in its evaluation). The on_failure setting controls what happens next:

| Value | Behaviour |
| --- | --- |
| draft (default) | Posts the agent's evaluation comment and opens a draft PR with whatever changes were made. Also opens a draft PR if the agent exhausts its iteration budget or fails mid-run with committed work on the branch. |
| comment | Posts a comment with the agent's evaluation and a link to the run logs. No PR is created. |

agent:
  on_failure: comment # post a comment only, no draft PR

Use the default draft to preserve partial work for review and completion. Set comment if you prefer a comment-only failure mode and don't want a draft PR created.

Other Configuration Options

agent:
  # Target branch for PRs (default: main)
  branch: main

  # Assign the triggering user to the issue when the agent starts (default: true)
  assign_issue: true

  # Assign the triggering user to the resulting PR (default: true)
  assign_pr: true

  # Watchdog timeout in minutes — kills the agent after this many minutes
  # so cost report and artifact upload steps still run (default: 120)
  timeout_minutes: 120

  # Bash output truncation limit in characters. Outputs longer than this are
  # trimmed to the first 4k + last 4k chars (middle dropped) to prevent context
  # bloat. The agent is told how many chars were dropped. Set to 0 to disable.
  # (default: 8000)
  bash_output_limit: 8000

  # Number of recent tool call/result pairs kept in context. Older pairs are
  # replaced with a placeholder to prevent O(N²) token growth on long runs.
  # Set to 0 to keep all results. (default: 10)
  context_keep_tool_results: 10

You can also override max_iterations, branch, and context on a per-invocation basis without editing the config file — see Per-Invocation Arguments.
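
The bash_output_limit behaviour described in the config comments (first half + last half kept, middle dropped, with the agent told how much was cut) can be sketched as:

```python
def truncate_output(output: str, limit: int = 8000) -> str:
    """Trim an over-long tool output to its first and last `limit // 2`
    characters, inserting a note about how many characters were dropped.
    A limit of 0 disables truncation. Hypothetical sketch of the rule
    described above, not the code in lib/resolve.py."""
    if limit <= 0 or len(output) <= limit:
        return output
    half = limit // 2
    dropped = len(output) - 2 * half
    return output[:half] + f"\n... [{dropped} chars dropped] ...\n" + output[-half:]
```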

Council Configuration (Workshop and Build)

The workshop: and build: mode entries accept a council: list that controls which models participate in the review stage. If omitted, all configured models are used.

modes:
  workshop:
    council:
      - claude-small
      - gpt-small
  build:
    council:
      - claude-small
      - gemini-small

Security

Who Can Trigger the Agent

The workflow only runs when someone with OWNER, COLLABORATOR, or MEMBER role on the repository posts a /agent- comment. Anonymous users, first-time contributors, and external users cannot trigger agent runs — even on public repos.

This is controlled by GitHub's author_association field, which the workflow checks before starting any agent work. You can adjust who is allowed by editing the SECURITY_GATE marker in your workflow file.
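
The gate itself is a one-line membership check on the author_association value GitHub attaches to every comment event (sketch for illustration; the real check is in the workflow file):

```python
# Roles allowed to trigger agent runs; tighten to {"OWNER"} on sensitive repos.
ALLOWED_ROLES = {"OWNER", "COLLABORATOR", "MEMBER"}

def may_trigger(author_association: str) -> bool:
    """Return True if the commenter's author_association permits agent runs."""
    return author_association in ALLOWED_ROLES
```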

What the Agent Can Do

The agent has significant access to your repository:

  • Reads all files — code, configuration, documentation, secrets referenced in the codebase
  • Creates branches and pushes commits
  • Opens draft pull requests and posts comments

The agent's file access is scoped to the repository. It cannot access other repositories or organization-level secrets beyond those explicitly passed as workflow secrets.

Prompt Injection Risk

The agent reads issue and PR comment text as instructions. Malicious issue text could attempt to manipulate the agent — for example, asking it to commit secrets, exfiltrate data, or do something unrelated to the issue. This is called prompt injection.

Built-in mitigations:

  • Collaborator gating: Only people you've explicitly granted repo access to can trigger agent runs. An external attacker who can write an issue cannot trigger the agent unless they're already a collaborator.
  • Security microagent: The workflow includes hardened system prompt instructions (visible at the SECURITY_GATE marker) that tell the agent to refuse requests to exfiltrate secrets, modify CI pipelines, or take other unauthorized actions.
  • Runner isolation: The agent runs bash directly on the GitHub Actions runner VM. GitHub-hosted runners are ephemeral (discarded after each run) and isolated per-job. The runner environment does not persist between runs. The LLM API key is the main piece of value visible to the agent — the same would be true of any approach that must pass the key to make API calls.

Recommendations

  • Review all agent PRs before merging. Agent-created branches are draft PRs by default — treat them as you would code from any external contributor.
  • Use branch protection rules. Require PR reviews on main so no agent-created branch can be merged without a human sign-off.
  • Don't put secrets in issue bodies. The agent reads issues; sensitive data in issue text can appear in agent logs or commit messages.
  • Audit the SECURITY_GATE policy in your workflow file if you want to further restrict who can trigger the agent (e.g., OWNER-only on sensitive repos).

Troubleshooting

Getting a second PR instead of a revision

You probably commented on the original issue instead of the PR. Commenting on the issue always creates a new PR; commenting on the PR adds commits to the existing one. Check which page you're on before triggering.

(The two-PR behavior is also intentional when you want to compare different model implementations — trigger from the issue twice with different model aliases.)

Cost showing $0.00

The workflow couldn't capture token usage data from this run. Check the Actions log for the run — look at the "Calculate and post cost" or "Post cost comment" step to see what was found.

Agent triggered but no PR appeared

The agent ran, posted a comment with its evaluation, but didn't open a PR. This means the agent judged that it couldn't fully resolve the issue — it hit the iteration limit, got confused, or determined its changes were incomplete.

Try a more capable model (/agent-resolve-claude-large) or add more detail to the issue description. The agent's evaluation comment will say what it attempted and why it stopped.

To receive a draft PR with whatever partial changes the agent made, set agent.on_failure: draft (the default) in your remote-dev-bot.yaml (see When the Agent Can't Fully Resolve an Issue).

Diagnosing failures with an interactive agent: The fastest way to understand what went wrong is to ask an AI coding assistant to read the logs for you:

Have a look at issue 50 — I triggered the agent but it didn't make a PR. Look through the Actions logs and tell me what went wrong.

Or point at a specific run ID from the Actions tab:

Have a look at Actions run 12345678 in this repo. What went wrong?

The assistant can fetch the logs via gh run view, identify the failure point, and suggest a fix.

Other issues

See the Troubleshooting section in install.md for installation-related problems (workflow not triggering, secrets not reaching the workflow, etc.).

LLM Provider Quick Reference

Dashboard, billing, and API key management links for each supported provider.

Anthropic (Claude)

OpenAI (GPT)

Google (Gemini)

  • API keys · Usage & rate limits · Projects
  • Google AI Studio is the simplest way to manage Gemini API keys. It's a lightweight frontend to the same API available through Google Cloud Console.
