Block PRs that modify multiple agent directories by PunchTheDev · Pull Request #219 · PunchTheDev/forge

PunchTheDev · 2026-06-03T13:47:17Z

Attack vector being closed

A malicious miner could submit a PR that modifies both agents/their-agent/agent.py AND agents/someone-elses/agent.py. CI would evaluate whichever path sorts first, post a green result, and on merge silently overwrite the victim's agent in main. The victim's next merge would then run with corrupted code.

Fix

In the Find changed agent step (eval.yml), count unique agents/*/ directories touched by the PR. If more than one, exit 1 before any eval runs:

ERROR: PR modifies agent.py in multiple directories:
agents/alice-agent
agents/victim-agent
Each PR must touch exactly one agent directory.

The existing template exclusion (grep -v '^agents/template/') is preserved. Non-agent files in the same PR (e.g. a README.md inside the agent directory) are unaffected — only agent.py files are counted.
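The counting rule above can be sketched in Python for illustration. This is not the actual eval.yml step, which is a shell step in the workflow. The function names `changed_agent_dirs` and `check_single_agent` and the `agent=` output field are hypothetical, and the list of changed paths is assumed to come from something like `git diff --name-only`.

```python
from pathlib import PurePosixPath


def changed_agent_dirs(changed_files):
    """Return the sorted, unique agents/<name> directories whose agent.py changed."""
    dirs = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Only agents/<name>/agent.py is counted; README.md etc. are ignored.
        if len(parts) == 3 and parts[0] == "agents" and parts[2] == "agent.py":
            # Mirrors the existing template exclusion (grep -v '^agents/template/').
            if parts[1] != "template":
                dirs.add(f"agents/{parts[1]}")
    return sorted(dirs)


def check_single_agent(changed_files):
    """Return (exit_code, message); exit_code 1 means the PR must be rejected."""
    dirs = changed_agent_dirs(changed_files)
    if len(dirs) > 1:
        lines = ["ERROR: PR modifies agent.py in multiple directories:"]
        lines += dirs
        lines.append("Each PR must touch exactly one agent directory.")
        return 1, "\n".join(lines)
    if not dirs:
        # No agent.py touched: eval is skipped, as before.
        return 0, "found=false"
    # The "agent=" field is a hypothetical output name for this sketch.
    return 0, f"found=true agent={dirs[0]}"


if __name__ == "__main__":
    code, message = check_single_agent(
        ["agents/alice-agent/agent.py", "agents/victim-agent/agent.py"]
    )
    print(message)
    raise SystemExit(code)
```

Run as a script, the example prints the error text shown above and exits with status 1; with a single agent directory it would exit 0.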

Test plan

PR touching one agents/*/agent.py → CI proceeds normally
PR touching two or more agents/*/agent.py → CI exits 1 with clear error before building Docker image
PR touching only non-agent files in agents/** → CI skips eval (found=false, unchanged behavior)

🤖 Generated with Claude Code

Three independent fixes identified in scale readiness audit:

1. run_eval_pool.py: distinguish container crash (returncode != 0, no output) from bad JSON (container ran but output is garbage). Previously both showed "Invalid JSON output"; a crash now shows "Container exited 137" with stderr tail, making OOM kills and segfaults debuggable by miners.

2. record_submissions.py: skip STEP files smaller than 200 bytes. The file is pre-created as 0 bytes before docker run so the container can write to it; if the container crashes mid-run the file stays empty. Storing an empty BLOB sets has_step=true for a submission with no geometry, breaking the 3D viewer for that entry.

3. score.yml: increase score-round timeout-minutes from 90 → 150. 15 specs × ~180s each + Docker overhead ≈ 50 min per round; 90 min was dangerously close to the limit for slower specs under high load. eval.yml and hidden-eval remain at 90 min (3 specs each, which is sufficient).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
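The first two fixes in this commit message could look roughly like the sketch below. It is illustrative only and is not the code in run_eval_pool.py or record_submissions.py. The function names, the `tail_lines` parameter, and the constant `MIN_STEP_BYTES` are assumptions. Only the two error strings and the 200-byte threshold come from the commit message.

```python
import json

MIN_STEP_BYTES = 200  # threshold quoted in the commit message


def classify_container_result(returncode, stdout, stderr, tail_lines=5):
    """Return (ok, payload_or_error) for a finished container run.

    A crash (non-zero exit, no output) is reported separately from a run
    that produced output which is not valid JSON.
    """
    if returncode != 0 and not stdout.strip():
        # Container crashed: report the exit code plus the end of stderr.
        tail = "\n".join(stderr.strip().splitlines()[-tail_lines:])
        message = f"Container exited {returncode}"
        if tail:
            message = f"{message}\n{tail}"
        return False, message
    try:
        return True, json.loads(stdout)
    except json.JSONDecodeError:
        # Container produced output, but it could not be parsed.
        return False, "Invalid JSON output"


def should_store_step(size_bytes):
    """Skip STEP files below the threshold, e.g. the 0-byte pre-created file."""
    return size_bytes >= MIN_STEP_BYTES
```

For example, `classify_container_result(137, "", "Killed")` yields an error starting with "Container exited 137", while a zero exit code with unparseable stdout still yields "Invalid JSON output".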

A malicious PR could include changes to agents/alice/agent.py alongside agents/bob/agent.py. CI would eval the first alphabetically, but on merge both files land in main, silently overwriting another miner's agent.

Fix: in the 'Find changed agent' step, count unique agent subdirectories touched by the PR. If more than one, exit 1 with a clear error message before any eval runs. Each PR must touch exactly one agents/* directory.

The template directory is already excluded from eval. Non-agent file changes (README, spec.txt) in the same PR are still allowed as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Punch and others added 7 commits June 3, 2026 13:14

Clamp baseline geometry to build volume, cover load point Z

29aab0e

Leave 2mm margin from build volume boundary in baseline

df509db

Fix model IDs in CONTRIBUTING, add llm-agent to examples list

78b6193

Serialize score.yml to prevent concurrent DB-write overload

ca1a7c6

Add session-2 changelog entries

97b274c

PunchTheDev merged commit 9c9178a into main Jun 3, 2026
1 check failed

PunchTheDev merged 7 commits into main from punch/block-cross-agent-pr-modification
