Claude/implement p1 guardrails dc og i #266
Open
PetrAnto wants to merge 322 commits into cloudflare:main from
Conversation
debug: add granular logging around response parsing
The DO was dying during response.text(): after receiving headers (200) but while streaming the response body from DeepSeek. Added a 5-second heartbeat interval during body reading to:
- Keep the DO active during slow response streaming
- Update lastUpdate to prevent watchdog false triggers
- Log progress to diagnose slow responses
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: add heartbeat during response.text() to prevent DO death
The DO keeps dying during response.text() before any heartbeat fires, which suggests Cloudflare is hard-killing the DO, not just timing out. Changes:
- Heartbeat every 2s instead of 5s during body reading
- Add a 30s timeout wrapper around response.text()
- Checkpoint every 3 tools instead of 5 (less lost progress)
If the timeout fires, we'll see an error. If the DO still dies silently, the issue is Cloudflare terminating the process entirely.
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: more aggressive heartbeat and timeout during response.text()
Root cause found: the DeepSeek API sends HTTP 200 headers but then hangs while streaming the response body. The 30s timeout catches this. Added:
- Retry loop with up to 3 attempts for API calls
- Automatic retry on response.text() timeout
- 2-second delay between retries
- Logging to track retry attempts
This should make the bot much more resilient to DeepSeek's occasional response-streaming hangs.
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: add retry logic for DeepSeek API timeouts
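The retry loop described in this commit can be sketched roughly as follows, assuming the 3-attempt / 2-second-delay values from the message. The function name and signature are illustrative, not the actual client code:

```typescript
// Illustrative sketch of the retry pattern, not the actual Moltworker code.
// Retries the whole API call on any failure, pausing between attempts;
// the last error is rethrown once attempts are exhausted.
async function callWithRetries<T>(
  fn: () => Promise<T>,
  attempts = 3,
  delayMs = 2_000,
): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Wait before the next attempt (skipped after the final failure).
      if (i < attempts - 1) await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  throw lastErr;
}
```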
- DeepSeek V3.2 is the latest version, with GPT-5-class reasoning
- Routes through OpenRouter instead of the direct API (more reliable)
- Same cheap pricing: $0.25/$0.38 per 1M tokens
- Avoids the streaming-hang issues seen with the direct DeepSeek API
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
Fixes the response.text() hang issue with DeepInfra-routed models (Qwen3 Coder, etc.). Changes:
- Add chatCompletionStreamingWithTools() method to OpenRouterClient:
  - Uses SSE streaming (stream: true) to read the response incrementally
  - 30s idle timeout with AbortController for clean cancellation
  - Accumulates tool_call deltas by index
  - Returns the same ChatCompletionResponse structure as non-streaming
  - stream_options.include_usage for token tracking
- Update TaskProcessor to use streaming for the OpenRouter provider:
  - Non-OpenRouter providers keep the existing fetch-based approach
  - Progress callback updates the watchdog every 50 chunks
  - Retry logic preserved (3 attempts)
Why streaming fixes the hang:
- Non-streaming: response.text() waits for the entire body and can hang indefinitely
- Streaming: reads small chunks incrementally and detects stalls via the idle timeout
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
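As a rough illustration of the "accumulates tool_call deltas by index" step: in the OpenAI-style SSE format, each chunk carries partial tool-call data keyed by index, and the argument string arrives in fragments that must be concatenated. The types and helper name below are assumptions for the sketch, not the actual OpenRouterClient code:

```typescript
// Sketch: merge streamed tool_call deltas (OpenAI-style chunk format) into
// complete tool calls. The first delta for an index carries id/name; later
// deltas append fragments of the JSON arguments string.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface AssembledToolCall {
  id: string;
  function: { name: string; arguments: string };
}

function accumulateToolCalls(deltas: ToolCallDelta[]): AssembledToolCall[] {
  const calls: AssembledToolCall[] = [];
  for (const d of deltas) {
    // Create the slot for this index on first sight, then merge into it.
    const call = (calls[d.index] ??= { id: "", function: { name: "", arguments: "" } });
    if (d.id) call.id = d.id;
    if (d.function?.name) call.function.name += d.function.name;
    if (d.function?.arguments) call.function.arguments += d.function.arguments;
  }
  return calls;
}
```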
Claude/review merge conflicts yv ug x
Without this, if fetch() hangs before returning a response, the idle timeout never starts and we wait for the 90s watchdog. Now:
- 60s timeout on the initial fetch (before streaming starts)
- 30s idle timeout during streaming (resets on each chunk)
- Better error messages: "connection timeout" vs "idle timeout"
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: add 60s timeout on initial fetch for streaming
Root cause (from Grok research):
- Cloudflare Workers aggressively pool outbound connections
- After many requests to the same host, pooled connections become stale
- Reusing a stale connection causes fetch() to hang indefinitely
- AbortController doesn't reliably interrupt stuck pooled connections
Fix:
- Add a unique `_nc` query param to each request URL
- This forces potentially new connections, bypassing the stale pool
- Tradeoff: ~100-300ms extra latency per call (new TLS handshake)
- Benefit: eliminates the hangs entirely in most cases
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: add unique query param to bypass stale connection pooling
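A minimal sketch of the cache-busting idea, assuming the `_nc` parameter name from the commit message (the helper name is hypothetical):

```typescript
// Sketch: append a unique `_nc` query parameter so each request URL is one
// the connection pool has not seen, nudging the runtime toward a fresh
// connection instead of a possibly-stale pooled one.
function withNoConnectionReuse(url: string): string {
  const u = new URL(url);
  // Unique per call: timestamp plus a random suffix.
  u.searchParams.set("_nc", `${Date.now()}-${Math.random().toString(36).slice(2)}`);
  return u.toString();
}
```

The existing query string and path are preserved; only the extra parameter changes between calls.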
- Increased the idle timeout from 30s to 45s per Grok's analysis
- Added diagnostic info (model ID, content length) to timeout errors
- Note: the iteration-10 hang was likely caused by a version rollout during the test
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: increase streaming idle timeout to 45s for network resilience
AbortController only affects fetch(), not subsequent reader.read() calls. When the stream hangs mid-read, the abort signal doesn't interrupt it. Now each reader.read() is wrapped in Promise.race with a 45s timeout, ensuring mid-stream hangs are properly detected and trigger retries. https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: use Promise.race timeout on reader.read() for mid-stream hangs
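The Promise.race wrapper described above can be sketched like this; a hung reader.read() surfaces as a timeout error instead of blocking forever. Names are illustrative, not the actual code:

```typescript
// Sketch: race a single reader.read() against a timer. Aborting the fetch
// does not interrupt an in-flight read, so the race is what detects
// mid-stream hangs and lets the caller trigger a retry.
class IdleTimeoutError extends Error {}

async function readWithTimeout<T>(read: Promise<T>, ms: number): Promise<T> {
  let timer!: ReturnType<typeof setTimeout>;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new IdleTimeoutError(`no data for ${ms}ms`)), ms);
  });
  try {
    // Whichever settles first wins; a stalled read loses to the timer.
    return await Promise.race([read, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```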
The "task stopped unexpectedly" message was misleading users by suggesting CPU issues. Updated to correctly indicate API timeouts or network issues, and prompt them to tap Resume. https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
- Add autoResume flag to TaskState and TaskRequest
- Implement auto-resume in the alarm handler (up to 10 attempts)
- Add /automode (or /auto) command to toggle the setting
- Show auto-resume status in the /status command
- Update the error message to mention API timeouts instead of CPU
When enabled, tasks automatically resume on timeout instead of requiring a manual "Resume" button tap. Useful for long-running tasks with intermittent API timeouts.
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
Claude/review merge conflicts yv ug x
When resuming from checkpoint, the model would re-read rules and re-acknowledge the task instead of continuing implementation. This adds a [SYSTEM RESUME NOTICE] message to the conversation when loading a checkpoint, instructing the model to skip the acknowledgment phase and continue directly with implementation. Root cause: The skill prompt says "read rules and acknowledge", and the model follows that instruction on every resume. https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: add resume instruction to break re-acknowledgment loop
Auto-resume was failing for direct provider models (DeepSeek, DashScope, Moonshot) because the API keys weren't stored in TaskState and weren't passed to the reconstructed TaskRequest. Now stores dashscopeKey, moonshotKey, deepseekKey in TaskState and passes them through during auto-resume. https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: store direct API keys for auto-resume recovery
- Replace invalid deepchimera (deepseek-r1t2-chimera) with deepfree (deepseek-r1:free)
- Replace invalid mimo (xiaomi/mimo-v2) with nemofree (mistral-nemo:free)
- Fix devstral to use mistralai/devstral-small:free (valid free model)
- Fix grok to use the x-ai/ prefix instead of xai/
- Fix grokcode to x-ai/grok-code-fast-1
- Fix flash to google/gemini-3-flash-preview
- Fix geminipro to google/gemini-3-pro-preview
- Fix mistrallarge to mistralai/mistral-large-2512
Added new models:
- qwencoderfree: qwen/qwen3-coder:free (480B MoE free coding model)
- llama70free: meta-llama/llama-3.3-70b-instruct:free
- trinitymini: arcee-ai/trinity-mini:free (fast reasoning)
- devstral2: mistralai/devstral-2512 (paid premium coding)
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
fix: update invalid OpenRouter model IDs
Deep analysis of how steipete's projects (mcporter, Peekaboo, CodexBar, oracle) and the current OpenRouter tool-calling model landscape can improve Moltworker. Identifies 7 architectural gaps (parallel execution, MCP integration, reasoning control, etc.) with 8 actionable recommendations prioritized by effort/impact. https://claude.ai/code/session_011qMKSadt2zPFgn2GdTTyxH
Add comprehensive tool-calling landscape and steipete ecosystem analysis
Checkpoints are now persistent:
- Removed the 1-hour expiry: saves persist until manually deleted
- Checkpoints include the task prompt for better display
New save-slot system for multiple projects:
- /saves - List all saved checkpoints with details
- /save [name] - Show checkpoint info
- /saveas <name> - Backup current progress to a named slot
- /load <name> - Restore from a named slot
- /delsave <name> - Delete a checkpoint
Storage methods added:
- listCheckpoints() - List all checkpoints for a user
- getCheckpointInfo() - Get checkpoint metadata without full messages
- deleteCheckpoint() - Delete a specific checkpoint
- copyCheckpoint() - Copy between slots (for backup/restore)
Also updated the help message with the new commands and fixed outdated model references (deepchimera/mimo → deepfree/qwencoderfree).
https://claude.ai/code/session_01CoLZ1rPPP3Th81EGm55GAi
…DcOgI feat(acontext): Phase 2.3 Acontext observability integration
Add a holiday banner to the daily briefing using the Nager.Date public-holidays API (100+ countries). Reverse-geocodes the user's coordinates to determine the country code, queries Nager.Date for today's holidays, and displays a banner with holiday names (including local names) before the weather section. Non-blocking: gracefully skipped on any failure.
- New fetchBriefingHolidays() with NagerHoliday type
- Integrated into the generateDailyBriefing parallel fetch
- 9 new tests (689 total), typecheck clean
AI: Claude Opus 4.6 (Session: 01SE5WrUuc6LWTmZC8WBXKY4)
https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
…DcOgI feat(tools): Phase 2.5.9 holiday awareness via Nager.Date API
Replace the naive compressContext (keep N recent, drop the rest) and estimateTokens (chars/4) with a smarter token-budgeted system that:
- Assigns priority scores to messages (by role, recency, content type)
- Maintains tool_call/result pairing for API compatibility
- Summarizes evicted content (tool names, file paths, response snippets)
- Greedily fills the budget from the highest priority downward
New module: src/durable-objects/context-budget.ts (pure functions)
28 new tests, 717 total passing.
AI: Claude Opus 4.6 (Session: 018M5goT7Vhaymuo8AxXhUCg)
https://claude.ai/code/session_018M5goT7Vhaymuo8AxXhUCg
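The greedy budget-filling idea can be sketched as below. The role scores and the chars/4 estimator here are placeholders for illustration; the real context-budget.ts also preserves tool_call/result pairing and summarizes evicted content, which this sketch omits:

```typescript
// Hypothetical sketch of greedy, priority-ordered budget filling:
// score each message, keep the highest-priority ones that fit, then
// re-emit the survivors in their original conversation order.
interface Msg {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Placeholder priorities; the actual module scores by role, recency,
// and content type with different values.
const ROLE_PRIORITY: Record<Msg["role"], number> = {
  system: 100,
  user: 60,
  tool: 40,
  assistant: 20,
};

// Placeholder estimator (chars/4), standing in for a real tokenizer.
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

function compressToBudget(messages: Msg[], budget: number): Msg[] {
  // Rank by role priority; a small recency bonus breaks ties in favor
  // of later messages.
  const ranked = messages
    .map((m, i) => ({ m, i, score: ROLE_PRIORITY[m.role] + i * 0.01 }))
    .sort((a, b) => b.score - a.score);
  const keep = new Set<number>();
  let used = 0;
  for (const { m, i } of ranked) {
    const cost = estimateTokens(m.content);
    if (used + cost > budget) continue; // doesn't fit, try cheaper candidates
    used += cost;
    keep.add(i);
  }
  return messages.filter((_, i) => keep.has(i));
}
```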
The Acontext platform domain is acontext.io (by memodb-io), not acontext.com. Updates the default base URL in the client and the env type comment. https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
…DcOgI fix(acontext): correct API base URL from acontext.com to acontext.io
…NF641 feat(task-processor): Phase 4.1 token-budgeted context retrieval
Audit and harden token-budgeted retrieval with safer tool pairing, transitive keep-set closure, model-aware context budgets, and expanded edge-case coverage plus audit documentation.
AI: GPT-5.2-Codex (Session: codex-phase-4-1-audit-001)
…-budget-implementation fix(task-processor): Harden Phase 4.1 context-budget (safer tool pairing, model-aware budgets, estimator tweaks)
…check Cherry-pick the best parts of Codex PR #121 on top of PR #120:
- Rebalance priority scoring: tool results 40→55, plain assistant 20→18, system role added at 45, so tool evidence now survives over intermediate assistant reasoning during compression
- Add a final safety check to drop the summary if it pushes the result over budget
- Update existing tests to tolerate the summary being dropped on tight budgets
- Add 4 new tests: summary drop, system priority, out-of-order tools
All 731 tests pass, typecheck clean.
https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
…DcOgI fix(context-budget): improve priority scoring and add summary safety …
Prevent Cloudflare DO 30s CPU hard-kill by adding per-phase time budgets with checkpoint-save-before-crash behavior.
- Add phase-budget.ts helper with budget constants (plan=8s, work=18s, review=3s)
- Check elapsed time before each API call and tool execution
- On budget exceeded: save a checkpoint, increment autoResumeCount, let the watchdog resume
- Reset the phase clock on phase transitions and checkpoint resume
- Add PhaseBudgetExceededError with phase/elapsed/budget metadata
- Add comprehensive unit tests for budget checks and constants
https://claude.ai/code/session_01AtnWsZSprM6Gjr9vjTm1xp
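The per-phase budget check can be sketched as follows. The budget constants and error fields come from the commit message; the function names are illustrative:

```typescript
// Sketch of a per-phase time-budget guard. The caller invokes
// checkPhaseBudget() before each API call or tool execution; on a throw it
// saves a checkpoint and lets the watchdog resume the task.
const PHASE_BUDGET_MS = { plan: 8_000, work: 18_000, review: 3_000 } as const;
type Phase = keyof typeof PHASE_BUDGET_MS;

class PhaseBudgetExceededError extends Error {
  constructor(
    public phase: Phase,
    public elapsedMs: number,
    public budgetMs: number,
  ) {
    super(`phase "${phase}" exceeded budget: ${elapsedMs}ms > ${budgetMs}ms`);
  }
}

// `now` is injectable for testing; defaults to the wall clock.
function checkPhaseBudget(phase: Phase, phaseStartedAt: number, now = Date.now()): void {
  const elapsed = now - phaseStartedAt;
  if (elapsed > PHASE_BUDGET_MS[phase]) {
    throw new PhaseBudgetExceededError(phase, elapsed, PHASE_BUDGET_MS[phase]);
  }
}
```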
…elist Replace Promise.all with Promise.allSettled for parallel tool execution so one failed tool doesn't cancel the others. Add a PARALLEL_SAFE_TOOLS whitelist to control which tools can run in parallel vs sequentially.
- Add PARALLEL_SAFE_TOOLS set (11 read-only tools: fetch_url, browse_url, get_weather, get_crypto, github_read_file, github_list_files, fetch_news, convert_currency, geolocate_ip, url_metadata, generate_chart)
- Mutation tools (github_api, github_create_pr, sandbox_exec) are always sequential
- Parallel path only when ALL tools are safe AND the model has parallelCalls: true
- Promise.allSettled maps rejected results to error messages with tool_call_id
- Mixed safe+unsafe batches fall back to sequential execution
- Add tests for isolation, sequential fallback, error propagation, and the whitelist
https://claude.ai/code/session_01AtnWsZSprM6Gjr9vjTm1xp
…parallel-bAtHI Claude/budget circuit breakers parallel b at hi
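The parallel-execution policy above can be sketched like this. The whitelist contents come from the commit message; the executor signature and types are illustrative, and the model's parallelCalls capability check is omitted for brevity:

```typescript
// Sketch: run a tool batch in parallel only when every tool is read-only;
// Promise.allSettled isolates failures so one rejected tool doesn't cancel
// the rest. Mixed or mutating batches fall back to sequential execution.
const PARALLEL_SAFE_TOOLS = new Set([
  "fetch_url", "browse_url", "get_weather", "get_crypto", "github_read_file",
  "github_list_files", "fetch_news", "convert_currency", "geolocate_ip",
  "url_metadata", "generate_chart",
]);

interface BatchToolCall { id: string; name: string; args: unknown }
interface BatchToolResult { tool_call_id: string; content: string }

async function runBatch(
  calls: BatchToolCall[],
  exec: (c: BatchToolCall) => Promise<string>,
): Promise<BatchToolResult[]> {
  const allSafe = calls.every((c) => PARALLEL_SAFE_TOOLS.has(c.name));
  if (!allSafe) {
    // Mutating or mixed batch: strict sequential execution.
    const out: BatchToolResult[] = [];
    for (const c of calls) {
      out.push({
        tool_call_id: c.id,
        content: await exec(c).catch((e) => `Error: ${e}`),
      });
    }
    return out;
  }
  // All read-only: run in parallel, mapping rejections to error strings.
  const settled = await Promise.allSettled(calls.map(exec));
  return settled.map((s, i) => ({
    tool_call_id: calls[i].id,
    content: s.status === "fulfilled" ? s.value : `Error: ${s.reason}`,
  }));
}
```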
Fix inconsistencies left by the sprint session:
- GLOBAL_ROADMAP: 12→14 tools (add github_create_pr, sandbox_exec)
- GLOBAL_ROADMAP: Phase 1.1 clarifies that client.ts still uses Promise.all
- GLOBAL_ROADMAP: Add Sprint 48h section with a risk-mitigation note
- GLOBAL_ROADMAP: Fix dependency graph Phase 1 status
- next_prompt: Add sprint tasks to recently completed
- WORK_STATUS: Add S48.1/S48.2 tasks, update velocity (762 tests)
- claude-log: Add a sprint session entry with audit notes
https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
…th real BPE tokenizer Integrate gpt-tokenizer (cl100k_base encoding) for exact token counting in the context-budget system. The heuristic chars/4 estimator is kept as a safe fallback if the tokenizer throws.
- New: src/utils/tokenizer.ts with countTokens() and estimateTokensHeuristic()
- Modified: context-budget.ts, estimateStringTokens now delegates to the real tokenizer
- 18 new tokenizer tests, 772 total (all passing)
- Bundle impact: +1.1 MB (cl100k_base BPE ranks), well within the Cloudflare 10 MB limit
https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
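The fallback wiring can be sketched as below. The encoder is injected here so the sketch stays self-contained; per the commit message, the real module imports gpt-tokenizer's cl100k_base encoding instead. Function names follow the commit message, but this is not the actual tokenizer.ts:

```typescript
// Sketch: prefer an exact BPE token count, fall back to the chars/4
// heuristic if no encoder is available or the encoder throws.
type Encoder = (text: string) => number[];

function estimateTokensHeuristic(text: string): number {
  return Math.ceil(text.length / 4);
}

function countTokens(text: string, encode?: Encoder): number {
  if (encode) {
    try {
      return encode(text).length;
    } catch {
      // A tokenizer failure must never take down the budget pass;
      // silently fall through to the heuristic.
    }
  }
  return estimateTokensHeuristic(text);
}
```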
Best-of-5 Codex review: scored all candidate branches, extracted and fixed code from branch 4 (-8zikq4, 8/10). Adds backend route, API client types, AcontextSessionsSection component with status dots, age formatting, and responsive grid. 13 new tests (785 total). https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
…DcOgI Claude/implement p1 guardrails dc og i
…dedup Consolidated the best patterns from 4 parallel Codex implementations (PR130–133):
- PR2's DRY `executeToolWithCache()` method (single entry point, no code duplication)
- PR2's case-insensitive regex error detection (`/^error(?: executing)?/i`)
- PR3's in-flight promise dedup cache (prevents duplicate API calls for identical parallel tool calls in the same batch)
- PR3's explicit cache reset in `processTask()` (correct for DO instance reuse)
- PR1's relative call-count test pattern (robust against mock accumulation)
The cache only applies to PARALLEL_SAFE_TOOLS (read-only). Mutation tools (github_api, github_create_pr, sandbox_exec) always bypass the cache. Error results are never cached, to allow retries.
5 new tests (790 total), typecheck clean.
https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
…DcOgI feat(task-processor): Phase 4.3 — tool result caching with in-flight …
… quotes & personality Phase 4.4 (cross-session context continuity):
- Extended LastTaskSummary with resultSummary (first 500 chars of the response)
- Increased the TTL from 1h to 24h for cross-task context
- Added SessionSummary interface + ring buffer (20 entries per user in R2)
- Added storeSessionSummary, loadSessionHistory, getRelevantSessions, formatSessionsForPrompt
- Session context injected at all 3 system-prompt sites (main, vision, orchestra)
- 19 new tests for session storage, loading, relevance scoring, and formatting
Phase 2.5.10 (quotes & personality):
- Added fetchRandomQuote (Quotable API) with fetchRandomAdvice (Advice Slip) fallback
- Added fetchBriefingQuote exported function for testing
- Quote section added to generateDailyBriefing via Promise.allSettled (zero latency impact)
- Quote appears at the end of the briefing, silently skipped if both APIs fail
- 7 new tests for quote fetching and briefing integration
820 tests pass (790 + 30 new), typecheck clean.
https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
…DcOgI feat(learnings+tools): Phase 4.4 cross-session context + Phase 2.5.10…
Force-pushed from 20ea74f to ae6a103.
Implement Phase 5.5 web_search tool with Brave API execution, TTL cache, TaskProcessor/Telegram key plumbing, and test-coverage updates.
AI: GPT-5.2-Codex (Session: codex-phase-5-5-web-search-001)
Cherry-pick the best of both Codex PRs:
- PR 136: input validation (query.trim), Number.parseInt, error format with status code, braveSearchKey in the non-DO toolContext
- PR 137: tool ordering (web_search after fetch_news), vi.useFakeTimers for the TTL test, briefing-aggregator test counts 15 tools
https://claude.ai/code/session_01SE5WrUuc6LWTmZC8WBXKY4
Force-pushed from ae6a103 to 457ce29.
No description provided.