docs: add behavior control positioning with safety evidence by bmdhodl · Pull Request #348 · bmdhodl/agent47

bmdhodl · 2026-04-14T13:58:08Z

Summary

Add "Why Static Guards" section to README with three recent safety findings (one peer-reviewed, two from preview/preprint)
Position AgentGuard as behavior control (not just cost control)
Rule-based guards can't be socially engineered by the models they guard

Data points cited

Mythos Preview (April 2026) - found vulnerabilities in every major OS/browser, triggered government emergency meeting
Nature (2026) - (peer-reviewed) evidence of LLMs disabling oversight, scheming, leaving hidden notes
War games (arXiv 2602.14740) - GPT-5.2, Claude Sonnet 4, Gemini 3 Flash showed spontaneous deception, 0% surrender, nuclear escalation

Test plan

All 672 existing tests pass
No SDK code changes (docs only)
PyPI README regenerated and in sync
Claims backed by cited sources

🤖 Generated with Claude Code

AgentGuard leads with cost control today. Three recent data points (Mythos Preview government emergency, Nature peer-reviewed deception evidence, arXiv war games nuclear escalation) validate the thesis that static rule-based guards are the correct architecture for agent safety. This adds a "Why static guards" section making that case. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector · 2026-04-14T13:58:27Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Copilot

Pull request overview

Adds a new documentation section positioning AgentGuard’s static, deterministic guards as “behavior control” and introduces supporting safety-related evidence in the README surfaces (GitHub + PyPI).

Changes:

Add a new “Why static guards” section describing behavior-control framing and deterministic guard benefits.
Cite three safety-related evidence points (Mythos Preview, a Nature paper, and an arXiv war-games preprint) in both README.md and the generated PyPI README.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
`README.md`	Adds a new “Why static guards” section to reframe the product and list safety evidence.
`sdk/PYPI_README.md`	Updates the generated PyPI README to include the same new section content.

Address Copilot review on PR #348: - Add (source) links to Mythos Preview, Nature, and arXiv war games citations - Remove '(arXiv 2602.14740)' inline ref in favor of explicit link - Apply to both README.md and sdk/PYPI_README.md

Copilot AI review requested due to automatic review settings April 14, 2026 13:58

Copilot started reviewing on behalf of bmdhodl April 14, 2026 13:58 View session

Copilot AI reviewed Apr 14, 2026

View reviewed changes

Comment thread sdk/PYPI_README.md Outdated

Comment thread README.md Outdated

Comment thread README.md Outdated

docs: add source citations to safety evidence bullets

7455dfc

Address Copilot review on PR #348: - Add (source) links to Mythos Preview, Nature, and arXiv war games citations - Remove '(arXiv 2602.14740)' inline ref in favor of explicit link - Apply to both README.md and sdk/PYPI_README.md

bmdhodl merged commit fe6c5b6 into main Apr 14, 2026
12 checks passed

bmdhodl deleted the feat/safety-narrative-behavior-control branch April 14, 2026 22:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

docs: add behavior control positioning with safety evidence#348

docs: add behavior control positioning with safety evidence#348
bmdhodl merged 2 commits intomainfrom
feat/safety-narrative-behavior-control

bmdhodl commented Apr 14, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector bot commented Apr 14, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

bmdhodl commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Data points cited

Test plan

Uh oh!

chatgpt-codex-connector bot commented Apr 14, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bmdhodl commented Apr 14, 2026 •

edited

Loading