fix(deps): update dependency agents to ^0.14.0 #88
Open
renovate[bot] wants to merge 1 commit into
523234a to
0cab6c9
Compare
0cab6c9 to
8c04c20
Compare
This PR contains the following updates:
^0.9.0 → ^0.14.0
Release Notes
cloudflare/agents (agents)
v0.14.5
Patch Changes
#1613
124a47a Thanks @threepointone! - Introduce the first Think framework layer for convention-driven agent apps.
This release adds a manifest-driven Vite plugin that discovers agents from the
agents/ directory, generates a Worker entrypoint and virtual framework
modules, derives stable Durable Object class names, and merges framework-owned
Worker config defaults with user Wrangler config. It also keeps the Think Vite
plugin usable directly in normal Vite plugin arrays.
The framework now supports optional app server entries, manifest-scoped friendly
agent and sub-agent routing, deterministic route surfaces, colocated skill
detection, Worker Loader requirement diagnostics, and explicit diagnostics for
unsupported nested sub-agent conventions. Think currently supports top-level
agents and one sub-agent layer; deeper nesting is rejected with guidance so that
the routing and lifecycle model can be designed deliberately.
This framework layer is experimental: both the Vite plugin (once, on build
start) and the think CLI (on startup) emit a notice that the API may change
or be removed in any release. The core Think agent runtime is unchanged.
The Think CLI now includes think init, think inspect, and think types.
think init scaffolds a minimal Workers/Vite Think app, safely handles prompted
or named target directories, refuses unsafe migrations, and installs npm
dependencies by default. think inspect exposes manifest/config diagnostics in
text or JSON, while think types generates Think-owned declarations and can
optionally compose with Wrangler type generation.
This release also adds host-framework coverage for React Router and TanStack
Start, updates examples to use the convention-first framework shape, and hardens
Agents/worker-bundler virtual modules for bundled skill compatibility.
#1613
124a47a Thanks @threepointone! - Compile skill scripts ahead of time and remove the in-Worker bundler (drops ~14MB of esbuild-wasm from Worker bundles).
Skill scripts are now always compiled to self-contained JavaScript before they run, and the runtime no longer ships an in-Worker bundler (@cloudflare/worker-bundler is no longer a dependency of agents):
scripts/*.ts/.tsx/.js/.mjs) with esbuild at build time, resolving sibling imports and stripping TypeScript, and marks them precompiled.
compileSkillScript helper is exported from agents/skills/compile for use in your publish/upload tooling.
Breaking: if you ship raw TypeScript or multi-file skill scripts to R2 (or another dynamic source) and relied on the in-Worker bundler to compile them at runtime, bundle them ahead of time (e.g. with compileSkillScript) before upload. Bundled skills handled by the Vite plugin require no changes. The previously-added stubWorkerBundler option has been removed (there is nothing left to stub).
v0.14.4
Patch Changes
#1693
6496c80 Thanks @threepointone! - Fix AIChatAgent orphaned-stream recovery merging a new assistant turn into the previous assistant message (#1691).
When a stream was interrupted before its final assistant message was persisted (Durable Object hibernation, deploy churn, isolate restart, reconnect), orphan recovery reconstructed the message from stored chunks. If those chunks carried no provider start.messageId (the common case), recovery fell back to the last assistant message in history. That is correct for a continuation, but wrong for a normal new turn after a later user message: the recovered chunks for the new turn were appended onto the previous assistant message, corrupting both the persisted transcript and future model context.
The assistant message id allocated when a stream starts is now persisted in the resumable-stream metadata (ResumableStream.start() records message_id). When the reconstructed chunks carry no provider start.messageId (the common case, and the one that triggered the bug), orphan recovery now uses this stored id instead of the last-assistant fallback, so a new turn becomes its own message and a continuation still merges into the message it was extending (it stored the cloned last-assistant id). A provider start.messageId, when present, still wins, matching the live path which adopts it for new turns. Stream rows written before this release have no stored id and keep the previous behavior (provider id if present, otherwise the last assistant message). The metadata migration adds a single column, guarded by a schema check so it runs only once.
This also fixes two related variants of the same corruption on the durable (chatRecovery) continuation path:
toolCallId already exists on the message.
onChatRecovery returned { persist: false }: recovery would "continue" it by cloning the previous assistant message, merging the new turn into it. Recovery now detects that the conversation leaf is still the user message (no partial to continue) and re-runs the turn fresh, so it becomes its own message.
@cloudflare/think is unaffected: its session-tree recovery already allocates a distinct message id per orphan and never falls back to the last assistant message.
v0.14.3
Patch Changes
1e49880 Thanks @threepointone! - Batch and pack chat-persistence SQLite writes to reduce rows written and round-trips.
agents: ResumableStream now packs each buffered group of stream chunks into a single SQLite row (a JSON array of chunk bodies) instead of writing one row per chunk. Single-chunk and large-chunk segments are stored unwrapped, and a per-segment byte cap keeps rows within the 2 MB SQLite row limit. This cuts chunk rows written / stored / scanned-on-replay by up to ~10×. Reads (replay, orphan reconstruction, getStreamChunks) transparently unpack both packed segments and legacy per-chunk rows, so existing stored data keeps working. Adds shared buildInClauseStrings and MAX_BOUND_PARAMS helpers exported from agents/chat.
@cloudflare/ai-chat: message cleanup (stale-row pruning and maxPersistedMessages enforcement) previously issued one DELETE per row in a loop; it now deletes rows in batched DELETE ... WHERE id IN (...) queries (capped at 100 bound parameters per query).
@cloudflare/think: deleteSubmissions() cleanup previously issued one DELETE per terminal submission (up to 500 per call); it now deletes rows in batched DELETE ... WHERE submission_id IN (...) queries.
@cloudflare/ai-chat & @cloudflare/think: chat-recovery incident TTL sweep previously deleted each stale incident with a separate awaited storage.delete(key) (which also defeats Durable Object write-coalescing); it now deletes incidents in batched storage.delete(keys) calls (up to 128 keys per call).
v0.14.2
Patch Changes
#1684
ab6dd95 Thanks @threepointone! - warn when chatRecovery is configured in onStart() (applied too late for wake recovery)
On every Durable Object wake the SDK evaluates chat-recovery budgets (and may seal an interrupted turn, firing onExhausted) before the user's onStart() runs (_checkRunFibers() is ordered ahead of onStart()). A chatRecovery config produced inside onStart() is therefore read as the built-in defaults at the moment recovery decides, so a configured maxRecoveryWork / shouldKeepRecovering / onExhausted silently never applies to the recovery that matters.
This is now documented on ChatRecoveryConfig and the chatRecovery fields of Think / AIChatAgent, and the SDK logs a one-time warning if it detects chatRecovery being reassigned during onStart(). The warning fires both for a custom config object and for chatRecovery = true (enabling recovery / its defaults too late); assigning false (disabling) in onStart() is intentionally not warned, since recovery already ran with the pre-onStart() value and disabling it afterward is a benign no-op for that wake. The fix is to assign chatRecovery as a class field or in the constructor.
#1672
f96a2ba Thanks @threepointone! - fix(chat-recovery): a turn making forward progress now survives unbounded deploy churn; add a work budget + shouldKeepRecovering runaway guard
Durable chat recovery used to bound a single incident with a non-resetting 15-minute wall-clock ceiling (CHAT_RECOVERY_MAX_WINDOW_MS). That ceiling was overloaded: it served as both a recovery-duration bound and a runaway-loop guard, and it terminated healthy, actively-progressing turns that simply took longer than 15 minutes of wall-clock to finish while being repeatedly interrupted by a dense deploy window, sealing them with reason="max_recovery_window_exceeded" and discarding completed work.
The two jobs are now decoupled (see design/rfc-chat-recovery-work-budget.md):
chatRecovery.maxRecoveryWork caps the produced content/tool units since an incident opened; exceeding it seals with reason="work_budget_exceeded". Defaults to Infinity: the SDK ships the mechanism but imposes no implicit cap, so it never terminates a progressing turn on its own.
chatRecovery.shouldKeepRecovering(ctx) is consulted per recovery attempt from the second onward (only when no hard bound has already sealed the incident); returning false seals with reason="recovery_aborted". This is where integrators express token/cost/step budgets the SDK should not hardcode. A throwing predicate is logged and treated as "keep recovering".
chatRecovery.noProgressTimeoutMs (default 5 min, resets on progress) is the primary stuck-turn bound, now overridable per agent instead of a hardcoded constant.
New public types from agents/chat: ChatRecoveryProgressContext. New ChatRecoveryConfig fields: maxRecoveryWork, shouldKeepRecovering, noProgressTimeoutMs. ChatRecoveryExhaustedContext.reason gains work_budget_exceeded and recovery_aborted; max_recovery_window_exceeded is retained as an open-string value but is no longer emitted.
Both @cloudflare/ai-chat and @cloudflare/think (which carries its own copy of the recovery engine) are updated identically. Defaults are unchanged except that a progressing turn is no longer terminated by wall-clock age.
#1668
d40cc8a Thanks @ghostwriternr! - Fix RPC resource leaks in workflows.
Workflows that use waitForApproval() or ThinkWorkflow.prompt() now release their RPC stubs promptly, preventing resource leaks and the associated "RPC stub was not disposed" warnings in your logs.
#1679
c8d1d32 Thanks @threepointone! - fix(sub-agents): a facet sub-agent no longer touches the root DO's WebSockets, fixing a production-only "Cannot perform I/O on behalf of a different Durable Object (Native)" crash (#1677)
A sub-agent (facet) that called setState(), broadcast(), or otherwise enumerated connections (directly or indirectly via the internal _broadcastProtocol()) could crash in production with Cannot perform I/O on behalf of a different Durable Object. ... (I/O type: Native). It reproduced when the root Agent held a live (hibernatable) WebSocket connection and the child facet was freshly bootstrapped; it never reproduced in wrangler dev / miniflare, which made it hard to catch.
Root cause: the Agent overrides of getConnections() and getConnection() fell through to super.getConnections() / super.getConnection() for facets too. On a facet, that resolves to the host/root DO's hibernatable WebSockets, and reading their attachments from the facet's I/O context is a cross-DO native I/O access that workerd aborts. setState() tripped it only incidentally, because _broadcastProtocol() enumerates connections to compute its exclude list before sending anything.
Fix: a facet's client connections are all virtual (real sockets owned by the root and bridged in), so getConnections() / getConnection() now return only the facet's virtual sub-agent connections and never fall through to the host DO's sockets. Delivery of facet state updates to clients connected directly to the sub-agent is unchanged.
#1670
5d64940 Thanks @threepointone! - Fix: a deploy that interrupts an in-flight runAgentTool child no longer abandons the still-running child as interrupted.
Parent recovery re-attaches to a still-running child and tails it to its real terminal. Previously that re-attach used a flat 120s wall-clock budget that was not reset by the child's forward progress, so a healthy child whose recovery legitimately ran longer than the budget was sealed interrupted (and its already-completed work re-run from scratch), even while it was actively streaming.
The re-attach budget is now progress-keyed: it bounds how long the parent waits with no forward progress from the child (resetting on every forwarded chunk), so a genuinely hung/silent child still seals interrupted after one no-progress window and can never block recovery forever, while a healthy child that keeps streaming is followed through to terminal. The parent re-arms (opens a fresh tail) only when the child's stream closes cleanly while it is still advancing, i.e. a re-evicted-but-progressing child. A full no-progress window (the child went silent) seals no-progress immediately even if the child streamed earlier in that window; it no longer grants a bonus window. This is both the honest stall signal and what keeps at most one pending tail reader alive per re-attach (no per-cycle reader accumulation).
@cloudflare/think and @cloudflare/ai-chat additionally finalize a child facet's own agent-tool run row as soon as its recovered turn settles, regardless of whether recovery took the continue path (_chatRecoveryContinue) or the pre-stream retry path (_chatRecoveryRetry), so a re-attached parent collects the terminal result immediately instead of waiting out a full no-progress window after the child has already finished.
This release also adds:
RunAgentToolResult, the agentTool() AgentToolFailure envelope, the onAgentToolFinish lifecycle result, and the agent-tool-event wire event (kind "interrupted") now carry a machine-readable reason (AgentToolInterruptedReason: "no-progress" | "window-exceeded" | "not-tailable" | "inspect-timeout" | "inspect-failed" | "recovery-deadline") and a childStillRunning boolean on interrupted results, so callers (and UIs) can branch on why a run was abandoned (and whether the child is still running) instead of pattern-matching the human-readable error prose. retryable stays coarse (always true for interrupted); refine with reason / childStillRunning. These fields are persisted (schema bump), so they survive a reconnect replay: a client that reconnects after an interrupt reconstructs the same reason / childStillRunning a live client saw, rather than undefined. The persisted cause is cleared when a soft interrupted row is later repaired to completed / error.
AgentStaticOptions, agentToolReattachNoProgressTimeoutMs (default 120000, the progress-keyed no-progress budget) and agentToolReattachMaxWindowMs (default Infinity, meaning no implicit wall-clock cap), let an Agent tune re-attach. The hard ceiling defaults to uncapped to mirror chat-recovery's maxRecoveryWork: Infinity: a re-attached parent follows a healthy, still-advancing child for as long as it makes progress, exactly as it would on the live (never-evicted) path, so it never abandons a long-running-but-healthy child that simply outlasts a fixed wall clock under deploy churn. A hung/silent child is bounded by the no-progress budget; a content-runaway is bounded uniformly (live and recovery) by the child's own maxRecoveryWork / shouldKeepRecovering. Integrators that want a hard wall-clock cap (and the window-exceeded child teardown it triggers) can set agentToolReattachMaxWindowMs to a finite value. Symmetrically, setting agentToolReattachNoProgressTimeoutMs to Infinity now means "never seal on no-progress" (a silent-but-alive child is followed until its stream closes or the hard ceiling fires) instead of silently skipping the wait; 0 remains the "don't wait, collect only an already-terminal child" sentinel.
window-exceeded ceiling (where the child has had its full recovery window and is truly exhausted) it now cancels the child (childStillRunning: false) so it stops consuming a fiber / keep-alive. no-progress give-ups stay soft (childStillRunning: true): the child is left running so a re-issue can still re-attach and repair it if it self-heals, preserving the repair-on-re-issue path. In both @cloudflare/think and @cloudflare/ai-chat, cancelAgentToolRun also aborts an in-flight chat-recovery turn (not just the original in-isolate run) and releases live tails (Think sweeps its _submissionAbortControllers, ai-chat its request AbortRegistry (abortAllRequests)), so a torn-down child stops grinding instead of finishing an orphaned recovered turn.
#1680
8f9500a Thanks @threepointone! - Remove the now-redundant _suppressProtocolBroadcasts facet-bootstrap guard.
This flag was added in #1425 to stop _broadcastProtocol() from enumerating the
parent DO's WebSockets during facet bootstrap (the cross-DO Native I/O crash,
#1410/#1677). The proper fix in #1679 makes getConnections() / broadcast()
facet-safe at the source: on a facet they return only virtual sub-agent
connections and route through the parent bridge, never touching the parent's own
sockets. With that, suppressing broadcasts during bootstrap is unnecessary, and
removing it also lets legitimate state sync run during the bootstrap window.
The separate request/WebSocket/email native-handle clearing from #1425 is
retained, since #1679 does not cover that vector.
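The facet-safe lookup described above can be pictured with a small, self-contained sketch. Everything here is a hypothetical stand-in (`SketchAgent`, `Conn`, `readHostSockets` are not names from the agents SDK), and the throwing reader only imitates the runtime aborting a cross-DO native I/O access; it is meant to illustrate the shape of the fix, not its actual implementation.

```typescript
// Hypothetical, simplified model; none of these names are the agents SDK's real API.
interface Conn {
  id: string;
}

class SketchAgent {
  private readonly isFacet: boolean;
  // Reads the root DO's hibernatable sockets. Unsafe to call from a facet.
  private readonly readHostSockets: () => Conn[];
  // Connections bridged in from the root on behalf of clients.
  private readonly virtualConnections: Conn[];

  constructor(
    isFacet: boolean,
    readHostSockets: () => Conn[],
    virtualConnections: Conn[]
  ) {
    this.isFacet = isFacet;
    this.readHostSockets = readHostSockets;
    this.virtualConnections = virtualConnections;
  }

  getConnections(): Conn[] {
    if (this.isFacet) {
      // A facet's client connections are all virtual, so the lookup
      // never falls through to the host DO's sockets.
      return [...this.virtualConnections];
    }
    return this.readHostSockets();
  }
}

// Imitates the runtime aborting a cross-DO native I/O access.
const crossDoRead = (): Conn[] => {
  throw new Error("Cannot perform I/O on behalf of a different Durable Object");
};

const facet = new SketchAgent(true, crossDoRead, [{ id: "virtual-1" }]);
const facetConnectionIds = facet.getConnections().map((c) => c.id);

const root = new SketchAgent(false, () => [{ id: "socket-1" }], []);
const rootConnectionIds = root.getConnections().map((c) => c.id);
```

Because the facet branch returns before the host reader is ever invoked, enumerating connections on the facet cannot trip the simulated cross-DO error.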
#1675
d915bc6 Thanks @threepointone! - The skill runner now imports just-bash and @cloudflare/codemode statically instead of dynamically, and both have moved from optional peer dependencies to regular dependencies of agents. The dynamic imports were ineffective in bundled Workers (the bundler includes them eagerly regardless) and triggered INEFFECTIVE_DYNAMIC_IMPORT warnings when bundled alongside @cloudflare/think, which imports them statically. @cloudflare/think also now statically imports its internal ExtensionManager instead of dynamically, removing the third such warning.
#1662
df6c0d6 Thanks @threepointone! - Add opt-in recovery for mid-turn context-window overflow.
Compaction only fires between turns (Session.compactAfter checks the threshold on appendMessage). A single long, tool-heavy turn grows the prompt step-by-step inside one streamText loop and can exceed the model's context window mid-turn, before the next pre-turn check; the provider then 400s ("prompt is too long" / context_length_exceeded) and the turn dies terminally. Think deliberately ships no provider-specific error matching, so it could neither detect nor recover from this.
This adds opt-in, provider-agnostic recovery (all default off, so no behavior change unless enabled), configured through a single contextOverflow property on Think:
classifyChatError(error, ctx): the app maps a raw error (or the in-stream error string) to a ChatErrorClassification ("context_overflow" | "rate_limit" | "transient" | "fatal" | "unknown"). Same framework-owns-the-mechanism / app-owns-the-provider-knowledge split as tokenCounter. The classification is also threaded to onChatError / observers via ChatErrorContext.classification. The bundled, exported defaultContextOverflowClassifier covers the common providers (Anthropic, OpenAI, Google, Bedrock, …) for apps that do not need custom classification.
contextOverflow.reactive + contextOverflow.maxRetries: when a turn fails with a context_overflow the app classified, Think discards the truncated partial, runs session.compact(), and re-runs the turn (bounded) from the compacted history instead of dying. The partial is intentionally not persisted: the retry restarts the turn from scratch, so keeping the cut-off partial would orphan a half-finished assistant message beside the recovered answer (and duplicate any tool work the retry re-issues). A no-op compaction or a spent budget surfaces the overflow terminally through onChatError with classification: "context_overflow", never a silent end, never an infinite loop. Wired into the WebSocket, chat() / RPC, and programmatic (saveMessages / submitMessages) turn paths.
contextOverflow.proactive: a { maxInputTokens, headroom?, maxCompactions? } pre-step guard. When the previous step's model-reported usage.inputTokens crosses maxInputTokens * (headroom ?? 0.9), Think compacts in place and feeds the recompacted history into the upcoming step, heading off the provider 400 before it happens. Keys off model-reported usage (every provider reports it), not provider error strings. Bounded per step loop by its own maxCompactions (default 1, independent of the reactive maxRetries budget).
Also adds a chat:context:compacted observability event (agents) emitted (once) on both proactive and reactive compaction.
Notes:
streamText re-enqueues even top-level rejections as { type: "error" } fullStream parts, and toUIMessageStream passes them through without throwing), so the in-stream seam catches them on every path; the thrown-error catch path does not need separate wiring.
contextOverflow.reactive is enabled but classifyChatError was never overridden.
#1675
d915bc6 Thanks @threepointone! - The agents/vite plugin now stubs turndown by default.
turndown (pulled in transitively by just-bash for the workspace bash tool and skill runner) runs a top-level require() in its Node DOM fallback, which throws ReferenceError: require is not defined at Worker startup, even when the bash tool is never used. The plugin replaces it with an inert stub so Workers deploys stay clean. Opt out with agents({ stubTurndown: false }) if your app uses turndown directly.
v0.14.1
Patch Changes
#1659
f99f890 Thanks @threepointone! - Recover one-shot scheduled work (alarms) killed by a "This script has been upgraded…" deploy/code-update, not just "Durable Object reset because its code was updated.".
Previously, _executeScheduleCallback re-ran a one-shot schedule row after a superseded-isolate error only if the error matched /reset because its code was updated/i. The platform also surfaces the same failure class as "This script has been upgraded. Please send a new request to connect to the new version." (a stub/connection to a superseded script), which fell through to the swallow-and-delete branch: the one-shot row was deleted and the work abandoned. For a queued submission this orphaned the pending row with no driver (no alarm, no retry) until something unrelated woke the Durable Object, leaving the user on an indefinite spinner.
The superseded-isolate matcher now recognizes both messages, so either causes the row to be preserved and re-run on the fresh isolate under the at-least-once alarm guarantee.
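A minimal sketch of a matcher with this behavior, assuming a simple list of patterns. The helper names (`isSupersededIsolateError`, `decideOneShotOutcome`) and the second regular expression are illustrative assumptions; the SDK's real matcher is internal and may be written differently.

```typescript
// Hypothetical helper names; only the two error messages come from the notes above.
const SUPERSEDED_ISOLATE_PATTERNS: RegExp[] = [
  /reset because its code was updated/i,
  /This script has been upgraded/i
];

function isSupersededIsolateError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return SUPERSEDED_ISOLATE_PATTERNS.some((pattern) => pattern.test(message));
}

type OneShotOutcome = "preserve-and-rerun" | "swallow-and-delete";

// A superseded isolate keeps the one-shot row so it can run again on the
// fresh isolate; any other failure takes the swallow-and-delete branch.
function decideOneShotOutcome(error: unknown): OneShotOutcome {
  return isSupersededIsolateError(error)
    ? "preserve-and-rerun"
    : "swallow-and-delete";
}
```

Keeping the patterns in one array makes it easy to see which platform messages count as an isolate replacement and which do not.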
"Network connection lost."is intentionally not included (it is a connection error that may succeed on in-process retry, not an isolate replacement).#1661
41315b6 Thanks @threepointone! - Enforce the tool_use.input invariant at the chat write boundary.
A streamed tool call that finishes with no input_json_delta events (the model called the tool with no args), or whose input surfaces as a stringified JSON blob, could persist a non-object input: null, undefined, "", an array, or a raw string. The Anthropic Messages API requires tool_use.input to be a JSON object and rejects every subsequent turn with tool_use.input: Input should be an object (verified against the live API: {} → 200, but "", [], and [{...}] all → 400). Because the bad shape lives in durable storage, the session is wedged across reconnects, redeploys, and DO evictions.
applyChunkToParts (the shared accumulator used by @cloudflare/ai-chat and @cloudflare/think) now normalizes the finalized tool input on tool-input-available / tool-input-error: a plain object passes through untouched, a stringified-JSON object is parsed, and everything else (null / undefined / "" / arrays / primitives / unparseable strings) collapses to {}. A new normalizeToolInput helper is exported from agents/chat so read-side transcript repair can enforce the same invariant.
#1665
13d6db0 Thanks @threepointone! - Await Chat SDK state-agent cleanup scheduling during startup so tests and short-lived worker isolates do not leave dangling cleanup work.
#1666
01a0b35 Thanks @dcartertwo! - Fix MCP OAuth PKCE verifier lookup for overlapping authorization attempts.
DurableObjectOAuthClientProvider now binds pending PKCE verifiers to the OAuth callback state instead of storing a single verifier per client/server. Callback handling runs token exchange and verifier cleanup in the returned state's context, so older auth windows and retry churn no longer exchange an authorization code with another attempt's verifier.
v0.14.0
Minor Changes
#1623
4c8b371 Thanks @threepointone! - agentTool() now returns a structured failure envelope instead of an opaque error string, so a parent agent can tell a transient interruption apart from a terminal failure.
Previously every non-completed sub-agent run collapsed to { ok: false, error: string }. A child that was reset/superseded by a deploy or parent recovery (interrupted) looked identical to a genuine failure or an intentional cancellation, so the parent model would often parrot the interruption text back to the user as if the work had permanently failed.
The failure value is now AgentToolFailure:
interrupted → retryable: true (the run never reached a logical outcome; re-dispatching can succeed), and now surfaces the underlying interruption reason via error.
aborted (intentional cancellation) and error (genuine failure) → retryable: false.
This is backward compatible for consumers that read ok / error; the new status and retryable fields let an orchestration harness (or a parent prompt convention) re-run an interrupted sub-agent automatically rather than reporting it as final. AgentToolFailure is exported from agents.
#1636
f5a0d00 Thanks @threepointone! - Expose recovery incident identity and enrich the onExhausted payload so
products can build a terminal-state policy without re-deriving anything (#1631).
ChatRecoveryContext (the onChatRecovery argument) now includes
recoveryRootRequestId, the stable request ID for the whole continuation
chain. Unlike requestId, it doesn't change across chained continuations, so
it's the right key for per-incident budget tracking / fresh-incident detection
without re-deriving identity from message IDs.
ChatRecoveryExhaustedContext (the onExhausted argument) now carries
recoveryRootRequestId, terminalMessage (the exact text shown to the user),
partialText / partialParts (what the turn produced before it was given up
on), and streamId / createdAt: enough to render or persist a user-facing
terminal banner AND emit correlated terminal telemetry (e.g. time-since-turn-start,
stream correlation) directly, without re-deriving anything.
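As a rough illustration of how these fields might be consumed, the sketch below tracks attempts per incident and builds a terminal record from an exhausted payload. The interface is a local, hypothetical shape: the field names come from the notes above, but the field types (for example `createdAt` as epoch milliseconds) and the helper names are assumptions, not the SDK's definitions.

```typescript
// Hypothetical local shape: field names are from the release notes,
// field types are assumptions made for this sketch.
interface ExhaustedPayloadSketch {
  recoveryRootRequestId: string;
  terminalMessage: string;
  partialText: string;
  streamId: string;
  createdAt: number; // assumed: epoch milliseconds
}

interface TerminalRecord {
  incidentKey: string;
  banner: string;
  streamId: string;
  msSinceTurnStart: number;
  hadPartialOutput: boolean;
}

// Attempts per incident, keyed by the stable root request id so that
// chained continuations of the same incident share one counter.
const attemptsByIncident = new Map<string, number>();

function noteRecoveryAttempt(recoveryRootRequestId: string): number {
  const next = (attemptsByIncident.get(recoveryRootRequestId) ?? 0) + 1;
  attemptsByIncident.set(recoveryRootRequestId, next);
  return next;
}

// Builds a banner-plus-telemetry record straight from the payload.
function toTerminalRecord(
  payload: ExhaustedPayloadSketch,
  now: number
): TerminalRecord {
  return {
    incidentKey: payload.recoveryRootRequestId,
    banner: payload.terminalMessage,
    streamId: payload.streamId,
    msSinceTurnStart: now - payload.createdAt,
    hadPartialOutput: payload.partialText.length > 0
  };
}
```

Keying the counter on `recoveryRootRequestId` rather than `requestId` is the point of the design: the per-request id changes on each chained continuation, so it would reset the count every time.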
All fields are additive. Applied across agents (shared types),
@cloudflare/think, and @cloudflare/ai-chat.
#1584
87006e2 Thanks @threepointone! - Add a framework-agnostic Agent Skills engine at agents/skills: skill sources (fromManifest, R2), a SkillRegistry that produces a catalog prompt and AI SDK activation tools (activate_skill, read_skill_resource, run_skill_script), binary-safe resource reads, and qualified cross-skill resource paths. Bundled skills are imported through the Agents Vite plugin with the agents:skills specifier (defaulting to a ./skills directory), typed via ambient declarations shipped from agents. @cloudflare/think re-exports the engine as skills and wires getSkills() into the turn; any AI SDK caller (including @cloudflare/ai-chat) can build a SkillRegistry directly.
Skill loading is resilient: duplicate or failing sources are skipped with a warning (first source wins) instead of throwing. Optional, experimental script execution (skills.runner) runs function-style JavaScript/TypeScript (export default run(input, ctx) with ctx = { skill, files, workspace, tools, output }) plus path-based Python and Bash, all behind a single capability and permission bridge.
#1648
d6827ab Thanks @threepointone! - Surface a live "recovering…" status to chat clients during durable recovery (#1620)
When a durable chat turn is interrupted (a deploy/eviction, or a stream-stall
watchdog abort) and resumes, clients had no "in progress" signal; the turn
looked frozen until it completed or a terminal error was replayed. A new
cf_agent_chat_recovering protocol frame is now broadcast on recovery schedule
and cleared on every terminal outcome (completed/skipped/failed/exhausted), so
the indicator can't spin forever. In @cloudflare/think it's also persisted and
replayed on connect, so a client that joins mid-recovery learns the turn is
working.
useAgentChat exposes a new isRecovering flag (distinct from isStreaming: a
recovering turn isn't producing tokens yet); most UIs render
isStreaming || isRecovering as "busy". Backward-compatible: clients that don't
understand the frame ignore it.
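A small sketch of how a UI might fold the two flags into one indicator. Only the flag names `isStreaming` and `isRecovering` come from the notes above; the `ChatFlags` shape, the helper names, and the choice to prefer the streaming label when both flags are set are assumptions made for illustration.

```typescript
// Hypothetical view-model helpers; only the two flag names come from the notes.
interface ChatFlags {
  isStreaming: boolean;
  isRecovering: boolean;
}

type ChatIndicator = "idle" | "streaming" | "recovering";

// "Busy" as most UIs would render it: either flag is enough.
function isBusy(flags: ChatFlags): boolean {
  return flags.isStreaming || flags.isRecovering;
}

function chatIndicator(flags: ChatFlags): ChatIndicator {
  // Assumption: if both flags are ever set, prefer the streaming label.
  if (flags.isStreaming) return "streaming";
  if (flags.isRecovering) return "recovering";
  return "idle";
}
```

Keeping `isBusy` separate from the label lets a component disable its input on either flag while still showing a distinct "recovering" message.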
For recovery telemetry, subscribe to the chat:recovery:* observability events
and route them to your analytics sink.
#1611
02f9380 Thanks @threepointone! - Add bounded, observable recovery foundations for durable chat turns and fibers.
chatRecovery defaults, and terminal exhaustion behavior for AIChatAgent and Think. Think recovery now exhausts after six failed attempts by default and sends a terminal error frame instead of spinning indefinitely.
retry and continue recovery kinds (the incident identity no longer includes the kind), guard a throwing onExhausted hook so the terminal UX is still delivered, mark incidents failed when the recovery dispatch throws, and reclaim incident records on success plus a TTL sweep for abandoned ones so durable storage does not grow without bound.
fiberRecoveryMaxAgeMs so a repeatedly-throwing onFiberRecovered() hook cannot re-trigger forever across restarts.
onChatError(error, ctx) and chat:request:failed.
createCompactFunction() to use a supplied token counter for tail budgeting.
#1640
edb126a Thanks @threepointone! - Re-attach to a still-running sub-agent (agentTool()) run on parent recovery instead of abandoning and re-running it (#1630).
When a parent agent was interrupted (deploy / Durable Object eviction) while a child agentTool() run was still in flight, recovery marked the run interrupted within a ~5s window and the parent re-issued the task, re-running the child's already-completed work. For long-running children under continuous deploys this surfaced to users as "the agent went all the way back and lost the files it already wrote."
Three changes fix this:
agentTool() now derives the child runId from the (recovery-preserved) tool call id (agent-tool:<toolCallId>) instead of minting a fresh nanoid per call. A turn re-run by chat recovery now resolves to the same idempotent child facet rather than spawning a brand-new one, so completed child work is never re-run.
runId (in runAgentTool) and a still-running child during startup reconciliation now tail the live child to its real terminal result and collect it, instead of immediately sealing interrupted. Re-attach is bounded by a generous wall-clock budget (DEFAULT_AGENT_TOOL_REATTACH_TIMEOUT_MS, 120s, internal): a child that keeps advancing toward terminal within the window is collected; a genuinely hung child still seals interrupted so recovery can never block forever.
chatRecovery, but that recovery path never wrote the child's agent-tool run row, so after a real eviction the row stranded running (think) / was force-errored (ai-chat) and the parent could never collect the recovered result. Both @cloudflare/think and @cloudflare/ai-chat now reconcile a stale child-run row from the durable transcript on inspect: while recovery is still resolving the row stays running; once it settles, a completed assistant response surfaces as completed (so the parent collects the real result) and an empty/failed recovery as error. This keeps the child's own (working) recovery path untouched.
No new public configuration. Adds an internal agent_tool:recovery:reattach observability event. @cloudflare/think and @cloudflare/ai-chat child tails are now read-only on consumer detach (a parent's re-attach budget expiring never cancels the still-running child).
#1598
f5e37bfThanks @threepointone! - AddThinkWorkflowwith durablestep.prompt()support for Workflow-owned Think reasoning steps.Patch Changes
#1623
4c8b371 Thanks @threepointone! - Compaction: the Session's `tokenCounter` now also drives the bundled `createCompactFunction`'s boundary ("what to compress") decision, not just the fire/no-fire trigger. Fixes #1593.
Previously a `tokenCounter` configured on `Session.compactAfter()` only influenced whether compaction fired; the boundary walk inside `createCompactFunction` still used the Workers-safe `chars/4` heuristic. On tool-heavy agent histories that heuristic under-counts badly, so the configured tail budget covered the entire history and `compressEnd <= compressStart`: compaction fired every turn but silently returned `null`, never shortening history (strictly worse than not configuring it).
Now the Session passes its counter to the compaction function via a new `CompactContext` argument, and `createCompactFunction` uses it for the tail-budget walk when no explicit `tokenCounter` was given on `CompactOptions`. So a single `tokenCounter` on `compactAfter()` drives both "should we compact?" and "what should we compact?". When the trigger fires but compaction still returns `null` (e.g. no counter configured and the heuristic protects everything), the Session logs a one-time warning instead of looping silently.
`CompactFunction` gains an optional second `context?: CompactContext` argument (backward compatible; existing one-arg functions are unaffected).
Note: the flowed counter is invoked per-message during the tail walk. A tokenizer-style counter gives accurate per-message budgeting; a usage-only counter that reports a fixed whole-prompt total degrades the tail budget to `minTailMessages` (compaction still runs and context stays bounded, but the byte budget is effectively ignored). For precise budgeting with such counters, pass an explicit per-message `CompactOptions.tokenCounter`.
#1617
5e60034 Thanks @threepointone! - Scheduled callbacks no longer drop their work when an alarm fires on an isolate
that a deploy has just superseded. In that window the first `ctx.storage` op
throws `Durable Object reset because its code was updated.` for the entire
invocation (code never reloads mid-invocation). Previously
`Agent._executeScheduleCallback` burned its in-process retries (all doomed),
swallowed the error, and `alarm()` deleted the one-shot row, permanently
abandoning the work even though the next fresh invocation would succeed. This
was a second deploy-churn abandonment path for chat recovery
(`_chatRecoveryContinue` / `_chatRecoveryRetry`) that the progress-aware budget
in `@cloudflare/think` / `@cloudflare/ai-chat` could not reach, because the
continuation was deleted before it could be re-detected.
For a one-shot schedule failing with this transient, the SDK now skips the
doomed in-process retries and re-throws so `alarm()` rejects: the one-shot row
survives and Cloudflare re-runs the alarm on a fresh isolate (= new code) under
the at-least-once alarm guarantee, so the work auto-resumes once the deploy
settles. All other callbacks and error classes keep the existing behavior.
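The decision described in this entry can be sketched roughly as follows. This is an illustration under stated assumptions, not the SDK's internals: `runScheduledCallback`, `isDeployReset`, and `ScheduleOptions` are hypothetical names, the callback is synchronous for brevity, and matching the transient by message text is an assumption. Only the error message itself comes from the release note.

```typescript
// Hypothetical sketch; the real logic lives in Agent._executeScheduleCallback.
const DEPLOY_RESET_MESSAGE = "Durable Object reset because its code was updated.";

// Assumption: the transient is recognised by its message text.
function isDeployReset(err: unknown): boolean {
  return err instanceof Error && err.message.includes(DEPLOY_RESET_MESSAGE);
}

interface ScheduleOptions {
  oneShot: boolean; // true for one-shot rows, false for recurring schedules
  maxAttempts: number; // in-process retry budget
}

// Returns "ok" on success, or "swallowed" once the retry budget is spent.
// Throws only for a one-shot schedule hit by the deploy-reset transient,
// so the caller (the alarm handler) rejects and the one-shot row is kept.
function runScheduledCallback(
  callback: () => void,
  options: ScheduleOptions
): "ok" | "swallowed" {
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      callback();
      return "ok";
    } catch (err) {
      if (options.oneShot && isDeployReset(err)) {
        // Retrying on this isolate cannot succeed; surface the error instead.
        throw err;
      }
      // Any other error: fall through and retry in-process.
    }
  }
  return "swallowed";
}
```

The point of the sketch is the asymmetry: the transient short-circuits the retry loop and propagates, while every other error keeps the retry-then-swallow path.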
#1608
7c17736 Thanks @cjol! - Fix auto-continuation stream resumes so immediate client-tool resume requests attach to the pending continuation instead of receiving `cf_agent_stream_resume_none`.
#1639
6bac0f4 Thanks @whoiskatrin! - Prevent MCP Streamable HTTP result responses from crossing between concurrent
POST streams when a reused session receives duplicate in-flight JSON-RPC
request ids. Responses now prefer the live connection that originated their
request and return JSON-RPC internal errors instead of guessing when no origin
can safely disambiguate colliding streams.
Completion tracking for batched POST streams is now scoped per stream so an id
collision on another POST cannot prevent the original stream from closing.
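One way to picture the "prefer the origin, otherwise refuse to guess" rule is the sketch below. It is an assumption-laden illustration rather than the transport's code: `routeResponse`, `StreamConnection`, and `RouteResult` are hypothetical, and how the real transport surfaces the error is not stated in the release note. The `-32603` code is the JSON-RPC 2.0 "Internal error" code.

```typescript
type JsonRpcId = string | number;

interface StreamConnection {
  streamId: string;
  live: boolean;
}

type RouteResult =
  | { kind: "deliver"; streamId: string }
  | { kind: "error"; code: number; message: string };

// JSON-RPC 2.0 "Internal error".
const JSON_RPC_INTERNAL_ERROR = -32603;

// Hypothetical router: `origin` is the connection that sent the request (if
// still known), `inflight` lists every connection with that id in flight.
function routeResponse(
  requestId: JsonRpcId,
  origin: StreamConnection | undefined,
  inflight: Map<JsonRpcId, StreamConnection[]>
): RouteResult {
  // 1. Prefer the live connection that originated the request.
  if (origin !== undefined && origin.live) {
    return { kind: "deliver", streamId: origin.streamId };
  }
  // 2. Without an origin, deliver only when exactly one live stream matches.
  const candidates = (inflight.get(requestId) ?? []).filter((c) => c.live);
  if (candidates.length === 1) {
    return { kind: "deliver", streamId: candidates[0].streamId };
  }
  // 3. Colliding ids and no safe way to disambiguate: error instead of guessing.
  return {
    kind: "error",
    code: JSON_RPC_INTERNAL_ERROR,
    message: "Cannot determine the originating stream for request " + String(requestId),
  };
}
```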
#1629
7d38363 Thanks @whoiskatrin! - Fix server-side `needsApproval` tool continuations remaining stuck after the
user approves them. Think now keeps approved/denied/errored tool parts in the
model transcript, updates its live transcript before an immediate continuation,
and persists and broadcasts terminal tool output emitted for a prior assistant
message. Continuation response frames are also labelled consistently so
`useAgentChat` can apply streamed continuation updates to the active UI state.
A pending `approval-responded` tool is no longer mis-reported by the
incomplete-tool-call backstop, so approval continuations stop logging a false
"repair gap" warning and emitting a spurious `chat:transcript:repaired` event.
The cross-message tool result now flows through `StreamAccumulator`'s
`cross-message-tool-update` action and a shared, replay-safe
`crossMessageToolResultUpdate` builder (exported from `agents/chat`): it matches
terminal states for first-write-wins idempotency against provider replays (e.g.
the OpenAI Responses API, #1404), preserves a streamed `preliminary` flag, and
lets `Think` skip redundant writes/broadcasts when a result is already settled.
#1607
f82d897 Thanks @mattzcarey! - Tighten SSE resumability in `McpAgent`'s streamable HTTP transport.
Follow-up to #1583.
Final tool response is now actually replayable. The previous code
stored the final response in the event store and then immediately
called `clearStream(streamId)` on `shouldClose`, deleting every event
for that stream, including the one just written. A client that lost
the connection mid-flight could reconnect with `Last-Event-ID` and
find nothing to replay. Fixed by flipping the order: write the SSE
event to the wire first, then drop the persisted
`streamId -> requestIds` mapping and clear the stored events. Every
event up to and including the final response is replayable while the
in-flight stream is open; the trade-off is that if the WS pipe is
enqueued but the client TCP dies before the bytes arrive, that one
final message is lost.
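The ordering change can be illustrated with a small in-memory stand-in. This is a sketch under assumptions: `InMemoryEventStore`, `sendFinalResponse`, and the `writeToWire` callback are hypothetical and only mirror the "store, write to the wire, then clear" sequence described above; the real transport persists to Durable Object storage.

```typescript
interface StoredEvent {
  eventId: string;
  message: string;
}

// Hypothetical in-memory stand-in for the transport's event store.
class InMemoryEventStore {
  private events = new Map<string, StoredEvent[]>();
  private counter = 0;

  store(streamId: string, message: string): string {
    const eventId = streamId + "_" + String(++this.counter);
    const list = this.events.get(streamId) ?? [];
    list.push({ eventId, message });
    this.events.set(streamId, list);
    return eventId;
  }

  // Events after `lastEventId`, or every stored event when it is omitted.
  replayAfter(streamId: string, lastEventId?: string): StoredEvent[] {
    const list = this.events.get(streamId) ?? [];
    if (lastEventId === undefined) return list.slice();
    const index = list.findIndex((e) => e.eventId === lastEventId);
    return index === -1 ? [] : list.slice(index + 1);
  }

  clearStream(streamId: string): void {
    this.events.delete(streamId);
  }
}

// Fixed ordering: persist, write to the wire, and only then clear.
function sendFinalResponse(
  store: InMemoryEventStore,
  streamId: string,
  message: string,
  writeToWire: (eventId: string, message: string) => void
): void {
  const eventId = store.store(streamId, message);
  writeToWire(eventId, message); // still replayable while this runs
  store.clearStream(streamId); // cleanup happens after the write
}
```

With the previous ordering (clear before the write), a reconnect arriving during the write would have found an empty store.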
POST event store writes are unconditional, matching the
standalone path. Previously the transport relied on a live WS
connection at `send()` time to record the event; if the client had
dropped (common during long tool calls on flaky networks) the event
was lost. Now the transport falls back to a persisted
`requestId -> streamId` reverse lookup (`McpAgent.getStreamForRequestId`),
stores the event, and writes to the wire only if a live connection is
still attached. Reconnecting with `Last-Event-ID` replays anything
that was missed.
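A rough sketch of "store unconditionally, write only if attached" follows. All names here (`PostStreamSender`, `trackRequest`, `attach`, `detach`) are hypothetical, and plain in-memory maps stand in for the persisted reverse lookup and the event store; the sketch does not call `McpAgent.getStreamForRequestId` itself.

```typescript
type RequestId = string | number;

interface RecordedEvent {
  streamId: string;
  message: string;
}

// Hypothetical sender; in-memory maps stand in for persisted state.
class PostStreamSender {
  // Stand-in for the persisted requestId -> streamId reverse lookup.
  private streamForRequest = new Map<RequestId, string>();
  // Live connections keyed by streamId; absent when the client has dropped.
  private liveWriters = new Map<string, (message: string) => void>();
  // Stand-in for the event store.
  readonly stored: RecordedEvent[] = [];

  trackRequest(requestId: RequestId, streamId: string): void {
    this.streamForRequest.set(requestId, streamId);
  }

  attach(streamId: string, write: (message: string) => void): void {
    this.liveWriters.set(streamId, write);
  }

  detach(streamId: string): void {
    this.liveWriters.delete(streamId);
  }

  // Returns true when the message also reached a live connection.
  send(requestId: RequestId, message: string): boolean {
    const streamId = this.streamForRequest.get(requestId);
    if (streamId === undefined) return false; // unknown request: nothing to route
    this.stored.push({ streamId, message }); // always recorded for replay
    const write = this.liveWriters.get(streamId);
    if (write === undefined) return false; // client dropped: replay later
    write(message);
    return true;
  }
}
```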
Resumed connection registers under the source streamId, matching
the SDK reference. For an active POST stream the persisted
`requestIds` are restored so future tool messages route to the new
WS. For the standalone listen stream the connection takes over that
role. For a completed POST the connection serves as a one-shot
replay channel. In every resumable case any prior connection bound
to the same streamId is closed, so there is at most one live
connection per stream and routing stays deterministic.
One stream per message, per the MCP spec. The spec requires the
server to send each message on exactly one connected stream and
forbids broadcasting the same message across streams. Server-
initiated notifications go to the single standalone GET stream (the
transport supersedes any prior standalone GET when a new one opens),
and POST responses go to their own stream. Events are still stored
for replay when no live stream is attached.
Cleanup is immediate, not background. Each POST stream's events
are cleared the moment the close frame is written. No alarms, no
metadata index, no sweep. Storage cost is bounded by the in-flight
POST streams plus the standalone GET stream. Multi-key deletes are
chunked at the Durable Object 128-key limit, and `replayEventsAfter`
uses an explicit `limit` so a pathological history can't OOM the DO.
Standalone GET events are not cleared automatically; they accumulate
for the lifetime of the session's Durable Object.
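The 128-key chunking mentioned above amounts to batching keys before each multi-key delete. A minimal sketch, assuming a storage object with a `delete(keys)` method shaped like the Durable Object storage API; `chunk`, `deleteAll`, and the `KeyValueStorage` interface are hypothetical helpers, not the SDK's code.

```typescript
// Durable Object storage accepts at most 128 keys per multi-key delete.
const MAX_KEYS_PER_DELETE = 128;

// Split `items` into consecutive batches of at most `size` entries.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Assumed shape, modelled on the multi-key form of storage.delete().
interface KeyValueStorage {
  delete(keys: string[]): Promise<number>;
}

// Delete every key, one bounded batch at a time; returns the deleted count.
async function deleteAll(storage: KeyValueStorage, keys: readonly string[]): Promise<number> {
  let deleted = 0;
  for (const batch of chunk(keys, MAX_KEYS_PER_DELETE)) {
    deleted += await storage.delete(batch);
  }
  return deleted;
}
```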
`DurableObjectEventStore` is exported so callers embedding `WorkerTransport` inside an Agent / Durable Object can wire up
resumability with `new DurableObjectEventStore(this.ctx.storage)`.
#1602
cfc75bc Thanks @mattzcarey! - Fix SSE keepalive and enable resumability on the MCP transports (#1583).
The MCP transports had a defective SSE keepalive (`event: ping\ndata: \n\n`,
a named event the SSE parser dispatched with empty data, firing
`addEventListener("ping", …)` on the client) and no recovery path for the
~5 min Cloudflare edge idle-stream watchdog. This change makes
resumability the first-class recovery mechanism while keeping the
keepalive available when resumability isn't configured.
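To see why the old frame was defective, the simplified parser below follows the dispatch rules of the server-sent events specification: a `data:` field with an empty value still fills the data buffer, so the event is dispatched, whereas a comment line (one starting with `:`) is ignored. `parseSse` is a hypothetical teaching aid that handles only LF line endings and the `event` and `data` fields; it is not the client or transport code.

```typescript
interface SseEvent {
  type: string;
  data: string;
}

// Simplified SSE parser: LF line endings only, `event` and `data` fields only.
function parseSse(input: string): SseEvent[] {
  const events: SseEvent[] = [];
  let type = "";
  let data = "";
  for (const line of input.split("\n")) {
    if (line === "") {
      // Blank line: dispatch unless the data buffer is empty.
      if (data !== "") {
        events.push({
          type: type === "" ? "message" : type,
          // Drop the single trailing LF added by the last data field.
          data: data.endsWith("\n") ? data.slice(0, -1) : data,
        });
      }
      type = "";
      data = "";
      continue;
    }
    if (line.startsWith(":")) continue; // comment line: ignored entirely
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1);
    if (field === "event") type = value;
    else if (field === "data") data += value + "\n";
  }
  return events;
}

// Old keepalive: a named "ping" event with empty data reaches listeners.
const oldKeepalive = parseSse("event: ping\ndata: \n\n");
// Comment-frame keepalive: nothing is dispatched to the client.
const commentKeepalive = parseSse(": keepalive\n\n");
```

Here `oldKeepalive` contains one `ping` event with empty data, while `commentKeepalive` is empty, which is the behaviour a keepalive should have.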
- [...] `EventStore` is configured, no keepalive; idle drops are recovered by clients reconnecting with
  `Last-Event-ID`. Without an `EventStore`, the comment-frame keepalive
  (`: keepalive\n\n` every 25s) keeps long-lived listeners alive.
- [...] calls survive the ~5 min idle watchdog. POST streams can additionally
  be resumed via `Last-Event-ID` when an `EventStore` is configured: a
  reconnecting GET inherits the original POST's `requestIds` so
  subsequent tool messages route to the resumed connection.
- `DurableObjectEventStore` now wraps each event
  with a write timestamp and exposes `sweep(maxAgeMs)`. `McpAgent`
  schedules a recurring sweep (default hourly, 24 hr TTL) so events from
  abandoned POST streams whose clients never returned don't accumulate
  forever in Durable Object storage. Streams that close cleanly are
  cleared in full on the final response.
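The timestamp-and-sweep idea from the list above can be sketched with an in-memory store. `SweepableEventStore` and its `put` method are hypothetical; only the `sweep(maxAgeMs)` name and the 24 hr TTL come from the release note, and the real `DurableObjectEventStore` persists to Durable Object storage rather than a `Map`.

```typescript
interface TimestampedEvent {
  message: string;
  storedAt: number; // write timestamp in milliseconds
}

// Hypothetical in-memory analogue of a sweepable event store.
class SweepableEventStore {
  private events = new Map<string, TimestampedEvent>();

  // `now` is injectable so the sketch can be exercised deterministically.
  put(eventId: string, message: string, now: number = Date.now()): void {
    this.events.set(eventId, { message, storedAt: now });
  }

  // Remove events older than `maxAgeMs`; returns how many were removed.
  sweep(maxAgeMs: number, now: number = Date.now()): number {
    const expired: string[] = [];
    this.events.forEach((event, eventId) => {
      if (now - event.storedAt > maxAgeMs) expired.push(eventId);
    });
    for (const eventId of expired) this.events.delete(eventId);
    return expired.length;
  }

  get size(): number {
    return this.events.size;
  }
}

// 24 hr TTL, matching the default described in the release note.
const DEFAULT_EVENT_TTL_MS = 24 * 60 * 60 * 1000;
```

A recurring job would then call `store.sweep(DEFAULT_EVENT_TTL_MS)` on whatever cadence it is scheduled for.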
Also fixed: a pre-existing bug where an `McpAgent` GET stream that
reconnected with `Last-Event-ID` received the replayed backlog but
wasn't re-tagged as the standalone SSE stream, so subsequent
server-initiated notifications had no connection to land on.
All changes are additive: patch-level, no breaking changes.
`DurableObjectEventStore` is exported from `agents/mcp` for stateful
`WorkerTransport` callers (e.g. the elicitation example, which now
wires resumability via `eventStore: new DurableObjectEventStore(this.ctx.storage)`).
#1641
3aa1936 Thanks @threepointone! - Count a sub-agent's progress as the orchestrating parent's recovery progress
A parent turn whose work is "run a sub-agent and await its result" produced no
recoverable content of its own, so under deploy churn the parent's own
chat-recovery no-progress window could exhaust while the child was still
healthily streaming, abandoning the turn as `interrupted` and collecting an
interrupted result even though the child went on t
Configuration
📅 Schedule: (in timezone Asia/Tokyo)
🚦 Automerge: Enabled.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.