Release 1.0.73 — composer console redesign + Git flow + dogfood set#2
Merged
Conversation
The auto-captured Workspace field now shows the home-relative form (`~/Documents/AGBench`) instead of the full absolute path — in the read-only preview, the saved bug-reports.md, and the pre-filled GitHub issue. Cleaner to read, and keeps the local home/user path prefix out of a shared issue. Extracts `tildifyHomePath` from ActivityPathDisplay (reusing the existing /Users/<user>/ -> ~/ logic, behaviour-preserving for activity rows) + 6 unit tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rity chips - Live x/140 character counter on the title (warns past 120 chars). - The 'Report saved' confirmation gains a popped checkmark + fade-in. - Selected severity chip lifts with a soft shadow + bolder label. All motion honours [data-reduce-motion='true']. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e ghost mascot Adds a reusable MascotGhost glyph (AppChromeSymbols, a leaf module) and uses it to warm three empty states: sidebar 'No chats yet' (centred ghost + a one-line nudge), sidebar 'No active runs' (small, quiet inline ghost respecting its deliberately-quiet design), and Settings 'No workspaces yet' (centred ghost above the existing hint). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tate - Dashboard tabs preview their underline faintly on hover (was colour-only) and gain a keyboard :focus-visible accent underline. - The 'No activity in the last 30 days' empty-range card now leads with the ghost mascot, consistent with the sidebar/settings empty states. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…vider colour The four original providers' tile borders referenced `--provider-X` (undefined -> stale hardcoded fallbacks like #d68a4d) while Grok/Cursor in the same block used the canonical `--provider-X-color`. Unify all six to `--provider-X-color` (theme.css source of truth: gemini #2563EB / codex #6366F1 / claude #D97706 / kimi #84A33B) with corrected fallbacks, so the tile tints match every other provider-tinted surface. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The popouts hardcoded a dark-mode palette: CodeMirror syntax + chrome colours, diff-list white striping, and missing light-theme --diff-* vars all rendered light-on-light (unreadable) under light/citrus/mist/sage/alabaster. - FileEditorPanel: CodeMirror highlight + chrome colours now read --cm-* vars. - theme.css: --cm-* defaults are the EXACT original dark palette (dark unchanged); light themes get a readable GitHub-light-ish editor palette + the missing --diff-add/del/hunk overrides. - diff list/lines: rgba(255,255,255,..) striping -> color-mix(var(--text-primary)..) so it adapts per theme. - popout BrowserWindow backgroundColor follows the theme (no dark flash on light). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ignals Orders 1 + 3 from the dogfood triage. On the run-complete card: a Cost row summing real cost_usd (per-token seats) plus a badged '~est. API-equiv' token->USD projection for subscription/credit seats (Codex/Grok/Cursor) via the provider rate table — row omitted when nothing real or estimable (no misleading $0.00); an explicit Latency (round wall-clock) row; and the previously dark-shipped ensemble escalation signals (incl. disagreement-unresolved) now render as an advisory chip with their recommendedAction copy. Logic in pure libs (runCompleteSummary.ts + new providerRateEstimate.ts), 28 new/extended tests. Makes the cost/value tradeoff visible — never shrinks the panel. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Order 2 from the dogfood triage. deleteChat only unlinked the chat JSON, orphaning that chat's run-event .jsonl + run-artifacts on disk forever. Now it also removes them, derived strictly from the chat's OWN runIds via the deterministic safeRunEventFileName mapping (never a directory scan), so a sibling chat's prefix-similar run files can't be caught. Best-effort + non-fatal (missing files ignored). 3 tests incl. the sibling-untouched anti-footgun case. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Order 4 from the dogfood triage. The approval card gains an optional one-line 'why?' note; it rides the EXISTING approval-ledger metadata channel (extraMetadata -> record.metadata.intentNote, no schema migration), is re-trimmed + capped at the untrusted IPC boundary, and surfaces read-only in the Approval Ledger panel. Always optional (never gates approve/deny), cleared between queued approvals. Surfacing the note to agents/participants is deliberately NOT done (prompt-injection risk). +1 AuditService test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…amed from' line) Order 5 from the dogfood triage. (a) The agent-visible transcript tag, roster lines, and self-label now carry a rename-stable #pN handle derived from a sort on the IMMUTABLE participant id (survives role renames AND speaking-order reshuffles), so agents can anchor on a peer across a rename. The #-prefix is resolver-safe (MENTION_REGEX needs a letter after @, so @#p3 never matches) - additive, no alias, no contract change; EnsemblePrompt tests updated. (b) The transcript shows a quiet 'renamed from X' continuity line for the human reader, preferring the session activity ledger and falling back to frozen-role-vs-current when it ages out. +17 tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Order 6 (final dogfood item). Adds 'elicit' (ask_user_question) + 'delegate' (subThreadDelegation) as first-class rows to the central ProviderxCapability matrix + the Inspector strip (5 rows -> 7). Per-provider state mirrors the existing MCP/creative logic (bridge-backed gemini/claude/kimi, AGBench-routed codex, provider-delegated grok/cursor). The two new rows are DISPLAY-ONLY: a canonical TOOLING_CONTROL_IDS (the 5 functional controls) + toolingControlRows() helper drive every enforced/delegated tally (Inspector, ProviderPreflightService, DelegationAudit, SettingsPanel), so promoting subThreadDelegation to a visible row never double-counts against its existing gate or shifts the X/5 count. Typed fan-out resolved (3 Record literals); +4 tests incl. the no-double-count invariant. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ice 1) Track F slice 1 (CSS-only, parallel-safe). Brings the native composer into the maintainer's rim-highlight idiom (a9d5482): a calm idle perimeter rim on .composer-surface drawn on a dedicated ::after (so it never disturbs the glass/classic/solid surface chrome), warming to a provider-tinted rim for the active provider; and recolours the 'Full Workspace Access' posture pill from alarm-red to an amber posture rim, readable in light + dark via color-mix(#f0a93f, --text-primary). Native shell only; reduce-motion honoured; hands off to the FX agent-aura when enabled. Running/ensemble rims TODO'd for the App.tsx hook phase. Build-verified. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lass frame (rim slice 2) Track F slice 2. Wraps the native composer's textarea + control rows in a new .composer-inner-module (theme-tone, so the input stays perfectly readable), and turns the outer .composer-surface into a CONTRAST GLASS - light/white-tinted on dark themes, dark/black-tinted on the 5 light themes - visible only as the frame around the module. The slice-1 provider rim carries onto both. Native shell only; the inner wrapper is display:contents on every other shell so codex/claude/grok/kimi/cursor + obsidian/alabaster are untouched. Reduce-motion honoured. typecheck + build green; the lone red test (IpcValidation) is a pre-existing gap in Codex's 6def607, unrelated to this diff. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…me + ensemble cleanup (rim slice 3) Track F slice 3 (live iteration). The inner module goes FULL-WIDTH + HARD CORNERS (edge-to-edge instrument panel; negative margins cancel the surface padding, clipped by the frame's rounded outer corners). The timecode/workspace/token telemetry row moves OUT of the inner module to a sibling below it. The outer frame flips to theme-POLARITY glass (near-black on dark, near-white on the 5 light themes) so the module pops against it. The schedule + runtime row is hidden in ENSEMBLE chats across ALL composer shells (not useful there + got buried under the new picker style). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…dth) Slice-3's full-width used width:100% + margin-left/right:-14px, but width:100% is the CONTENT-box width — a negative margin only pulls the LEFT flush; the RIGHT fell ~28px short (visible gap, esp. ensemble). Fixed with width:calc(100% + 28px) (and +32px for welcome's 16px padding) so the module spans the full frame on both edges. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes Codex's 6def607 (update pill + changelog sheet): the two IPC handlers were added without IPC_ARGUMENT_SCHEMAS entries, which made IpcValidation.test.ts red. changelog-snapshot is a no-arg read; mark-changelog-seen takes an optionalString version (the handler coerces a missing/empty version defensively). These entries were authored in the working tree but left uncommitted (Codex committed the git-hardening slice 8d348fc separately and believes its own worktree is clean); committing here so the gating red is durably green ahead of the 1.0.73 bump. Flag for Codex. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per maintainer: the native composer OUTER frame should read as true black (dark) / true white (light), with no theme bleed; the inner module keeps the theme tone + gradient. Strips the --composer-bg color-mix from all six surface variants (base/glass/solid x dark/light): dark to rgba(0,0,0,.95)/.9 + #000 solid; light to rgba(255,255,255,.96)/.91 + #fff solid. High alpha so the glass blur still breathes while the frame reads black/white. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replaces the Commit Changes prompt-injection (handlePrimeCommitChangesPrompt, removed) with a real user-driven flow via the stable preload Git API: gitSnapshot to gitStage(all) to gitCommit; createGithubPr gated on readiness. New GitCommitControls component in the above-bar diff menu lazily fetches the snapshot on open and lifts it to App via onSnapshot, so the above-bar header shows live branch / changed-file count / staged state / push + PR readiness (new composer-git-state-pill). Review-changes stays the first safety step; Create-PR is disabled with an inline reason unless repo valid + branch suitable. Implements Codex's relayed request (wire to 7f2a54c preload API, renderer only). Backend files untouched. Follow-up: consume Codex's canonical githubPrReadiness (8d348fc) in place of the client-side computePrReadiness mirror. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two fixes from maintainer screenshots of the pure-black/white pass: DARK outer still showed the theme side-vignette: a flat background can't override the .composer-surface::before gradient layer (base + per-provider tints + ensemble shard-09 tints). Kill it for the default shell's chat composers (.app-transcript) so the outer is a solid black/white bezel; the ::after provider rim is kept. LIGHT inner lost its theme: with the outer now pure white, the inner's near-white --composer-bg blended in. Give the light inner a distinct themed surface (subtle vertical gradient nudged off white toward the text colour), theme-relative across light/citrus/mist/sage/alabaster; the extra .app-transcript class beats the glass/solid inner variants. Dark inner unchanged (confirmed correct); welcome surface untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
In single-provider native chats the schedule + runtime-profile row (.composer-top-toggles) was absolutely positioned (top:10px, z-index:2) and got painted behind the console redesign's inner module (also z-index:2, full-bleed) — same z-index but later in the DOM, so the row was hidden. Lift it into normal flow ABOVE the inner module (it precedes it in the DOM) with z-index:3 so it reads as a clean right-aligned row over the textarea. Native chat composers only; ensemble doesn't render the row, other shells keep the floating placement. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Kimi tool calls stacked as a fresh card per event and never showed the target file. Two causes in emitCliProviderToolEvent: (1) the ToolCall branch keyed tool_id off payload.id while the ToolResult branch used payload.tool_call_id — different fields, so the renderer (which pairs result->call by tool_id) could not coalesce them; (2) Moonshot/OpenAI sends function.arguments as a JSON-encoded STRING, which the old isRecord() guard dropped to {}, so ToolParser had no file_path.
Add cliProviderToolId() — an ordered multi-field id lookup (tool_call_id/toolCallId/id/tool_id/toolId/call_id, mirroring Grok's mapper) resolved by BOTH branches so a call and its result share one id; and normalizeMcpToolArguments() to parse the string args so the filename surfaces. Kimi tool calls now coalesce into one inline card with the filename, like Codex/Claude. Best-effort (no live Kimi to test); the multi-field lookup covers all plausible result-id spellings.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
GitCommitControls gated Create-PR via a client-side computePrReadiness mirror of GitService's guards. Codex shipped a canonical, gh-aware window.api.githubPrReadiness (8d348fc) that also detects existing PRs + precise push-first state. Fetch it alongside gitSnapshot on menu-open (and after each commit) and prefer it for the gate; keep the local mirror as a graceful fallback while it loads or if the API/gh is unavailable. Removes the backend-logic duplication (#12). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a power-user tip to the first-launch sheet (section 7) covering 1.0.73's composer Git flow: Review changes menu -> branch + changed files -> Stage all & Commit -> Create PR when ready. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Ensemble row, the Create-PR / Review row, and the Steer/Queue row used the old composer look (var(--composer-bg) theme tone + a ::before side-vignette) — the treatment the console redesign already replaced on the surface — so they clashed with the new solid black/white frame below. Re-skin them to match the frame: solid pure black (dark) / white (light) fill, themed vignette killed, neutral hairline rim for definition. Provider colour stays on the chips, so the whole stack reads as one cohesive instrument. Native chat composers only (.app-transcript, single + ensemble); welcome surface + other shells untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hell) The AGBench native (default) shell was the ONLY composer shell without a control-order block, so the permission / tool-grants picker fell back to DOM order and trailed at the end of the control row. Add the standard order block (mirroring codex/kimi/grok): attach 0, permission 1, ensemble-mode 2, provider 3, workspace 4, model 5 (right-aligned). The permission picker now sits right after the + button, like every other shell. CSS order only — the shared composer JSX is untouched, so all other shells are unaffected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eviews The onboarding (FirstLaunchSheet) and Settings -> Appearance composer preview mocks predated the console redesign, so a new user previewing 'AGBench native' saw the old look. Mirror the new console: wrap the textarea + control rows in a .composer-inner-module (both mocks; layout-neutral for non-default shells via display:contents), and add dedicated preview CSS in shard 04 — solid black/white surface + above-bar (vignette killed) + a theme-tone inset inner panel — scoped strictly to the default style so the other-shell previews are untouched. (Real 03/07 console rules are .app-transcript-scoped and never reach these static cards, hence the dedicated preview CSS.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Highlights: native composer console redesign (solid frame + theme-tone inner module + aligned above-rows + permission-picker reorder), a real Git commit/PR flow in the composer, clearer Ensemble cost/escalation, optional approval intent notes, Kimi tool-call coalescing + filenames, and onboarding / Settings preview + light-theme refinements. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
boggspa
added a commit
that referenced
this pull request
Jun 12, 2026
Release 1.0.73 — composer console redesign + Git flow + dogfood set
boggspa
added a commit
that referenced
this pull request
Jun 22, 2026
…o toolbar menus (build 39) #2 (diff-pill, 3-reviewer adversarial synthesis): near-edge drag wrapped the pill to multi-line and its hit-area sat inside the iPad screen-edge-swipe gutter, blocking sidebar expansion. pillBody now lineLimit(1)+fixedSize(horizontal)+layoutPriority(1) (incompressible single line; spacers yield, no wrap/latch); maxOffsetX keep-clear edgeMargin 4->28pt (clears the ~20pt edge-pan gutter) and fails CLOSED to 0 when unmeasured (was .greatestFiniteMagnitude = no clamp, letting a first-frame drag fling the pill to the edge); restingLeadingInset gains a hard ceiling so a stale measurement can't overflow the row. #1 (toolbar style): the New + More MENU pills inherited the accent (blue) over the pill chrome's black/white foreground — force .tint(textPrimary) so they match the Refresh/Settings button pills + the inspector toolbar. iOS build 38 -> 39. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
boggspa
added a commit
that referenced
this pull request
Jun 23, 2026
…sions Roadmap #2 + #4(part 1) — two changes that must ship together: - MarkdownMessage: key the streaming tail by `tail-${stable.length}` instead of the per-keystroke djb2 content hash. The key now changes only when a block commits to the stable prefix, so the tail re-renders in place across the deltas that grow one block instead of unmount/remount every keystroke. This lets a streaming code fence reuse its CodeMirror EditorView (a cheap doc transaction) instead of destroying + recreating the whole view per delta. - HighlightedCodeBlock: useMemo the `extensions` array on `language`. A fresh array literal per render is treated by @uiw/react-codemirror as a StateEffect.reconfigure that would fire every delta and become the dominant per-delta cost once the tail key is stable; memoizing keeps the reference stable so only the value (doc) transaction runs. Adds a splitter test locking that stable.length is constant within a growing block and increments only at a commit boundary. typecheck:web clean; markdown/transcript/splitter tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
boggspa
added a commit
that referenced
this pull request
Jun 25, 2026
…Claude SDK fires exit inline (review, CRITICAL) Adversarial review of 83b3376 (Opus) found a ship-blocker. The solo-run bookkeeping (scheduledTaskIdBySoloRun.set + workflowBudgetRegistry.register) ran AFTER 'await runCoordinator.dispatch'. But the default Claude path (Agent SDK, tryRunClaudeSdk) consumes its stream INLINE and fires sendAgentCompatExit SYNCHRONOUSLY before dispatch returns — so the completion block read an EMPTY map and never marked the task terminal. A windowless (headless) solo Claude run therefore wedged 'running' until the 6h stall reconciler, which records it 'failed' and, after maxConsecutiveFailures (default 3), AUTO-DISABLES the workflow — i.e. successful recurring Claude runs silently self-destructed the workflow. The renderer path was masked (the renderer also marks completion); CLI providers were unaffected (async 'close'-handler exit, after dispatch returns). Fix: move the set + register BEFORE 'await runCoordinator.dispatch', keyed on routedPayload.appRunId (routeWithRunId preserves a set appRunId; a scheduled run always carries one from composeRun — matches the appRunId sendAgentCompatExit reads). This also fixes review finding #2 (a preflight-fail sendAgentCompatExit, fired synchronously inside dispatch, now finds the map populated → marks 'failed' instead of wedging) AND a latent same-root-cause gap where the budget kill never enforced inline SDK runs (onUsage fired before register). failoverSnapshotByRun.set stays post-dispatch (unrelated; read on a later quota-wall event). typecheck exit 0. Reviewer verified posture/ensemble/global/double-dispatch/sender/startup all clean; this was the sole ship-blocker. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Ships AGBench 1.0.73.
Highlights
+. Onboarding + Settings → Appearance previews updated to match.githubPrReadiness).Full details in
CHANGELOG.md(1.0.73 entry).🤖 Generated with Claude Code