[web] Agent-centric UI with dual-theme design system #12
Merged
fix: PostgreSQL + TanStack Start migration (merge-ready)
First pass at a dark "Star Chart" theme for the dashboard while exploring a celestial identity for Seer (the seer of AI apps and neural networks). Checkpoint before reworking toward the brass + ink "Star Atlas" (dark) / "Aged Chart" (light) instrument direction.
Build spec for Seer's dual-theme celestial instrument identity: brass + ink token system (both themes), no-flash theme switch, star-grid mark, astrolabe score orb, teaching empty state, and a trace surface (flame timeline + tool-call pill). Includes verified WCAG AA contrast tables and phased file-by-file sequencing.
Add PRODUCT.md and DESIGN.md design context, link them from AGENTS.md, and vendor the impeccable design skill plus the agent-centric UI plan.
Rewrite DESIGN.md to the dual-theme celestial brass/ink system (Star Atlas dark default + Aged Chart light) and update the agent-centric UI plan with a self-contained design summary citing DESIGN.md as canonical.
Reconcile the four design resolutions (first-class/version-stable datasets, non-numeric rail health, A/B hybrid workspace, progressive disclosure) onto the canonical Star Atlas / Aged Chart plan.
Replace the dashboard with an agent rail + tabbed workspace shell, add signature components (astrolabe orb, star-grid mark, instruments), and migrate routes/components from the legacy Glean palette to the Star Atlas / Aged Chart token system.
Scope agent detail stats to the selected agent, drop transcript overfetch, warm router preloads, and negatively cache failed name lookups so switching agents does bounded work. Replace the orb's needle with an animated filled dial and remove the clock-like ticks/constellation marks.
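The negative caching of failed name lookups mentioned above could look roughly like the sketch below. It is not the code from this PR: `createNameCache`, `Lookup`, `negativeTtlMs`, and the injectable `now` clock are hypothetical names chosen for illustration, and the 30-second retry window is an assumed default.

```typescript
// Hypothetical sketch of negative caching; names and TTL are assumptions.
type Lookup = (id: string) => Promise<string>;

interface CacheEntry {
  name: string | null; // null records a failed lookup
  retryAt: number;     // failed entries may be retried once now() >= retryAt
}

function createNameCache(
  lookup: Lookup,
  negativeTtlMs: number = 30000,
  now: () => number = Date.now,
) {
  const entries = new Map<string, CacheEntry>();

  return async function getName(id: string): Promise<string | null> {
    const hit = entries.get(id);
    // Serve successes forever and failures until their retry window expires.
    if (hit !== undefined && (hit.name !== null || now() < hit.retryAt)) {
      return hit.name;
    }
    try {
      const name = await lookup(id);
      entries.set(id, { name, retryAt: Infinity });
      return name;
    } catch {
      // Remember the failure so repeated agent switches do not re-hit the backend.
      entries.set(id, { name: null, retryAt: now() + negativeTtlMs });
      return null;
    }
  };
}
```

With this shape, switching back and forth between agents whose names cannot be resolved costs one lookup per retry window rather than one per switch. The sketch does not deduplicate concurrent in-flight requests for the same id.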
…comparison

Critique-driven fixes:
- Rebuild Run Results table: remove table-fixed so the agent-response column no longer collapses, give Input/Output real min-widths, and add a keyboard-accessible expand chevron (aria-expanded). Extra dimension columns now scroll horizontally instead of overlapping the header.
- Raise light-theme (Aged Chart) fg-1/fg-0 to clear WCAG AA 4.5:1 for secondary text and placeholders on both bg and surface.
- Migrate runs/$id, sets/$id, and ResultsTable off legacy token aliases (cement, glean-blue, bg-white, shadow-card, amber-*) to semantic tokens; unify the hero score on the AstrolabeOrb; drop emoji trace icons.
- Make the eval-set explainer collapsible and remove duplicated orb caption.

New power-user features:
- Command palette (Cmd/Ctrl+K) with fuzzy agent jump, quick actions, and g-h / g-s nav chords; combobox/listbox ARIA, body portal, token-only.
- Run-to-run comparison at /compare/$agentId?a=&b= with a dimension-by-dimension table and signed deltas; entry point is run-row checkboxes + Compare button.
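As a rough illustration of the dimension-by-dimension table with signed deltas, the sketch below diffs two runs whose scores are assumed to already sit on a shared numeric axis. `DimensionScores`, `DeltaRow`, `compareRuns`, and `formatDelta` are hypothetical names, not identifiers from this PR, and the dimension names in the usage note are made up.

```typescript
// Hypothetical sketch: per-dimension diff of two runs with signed deltas.
type DimensionScores = Record<string, number>;

interface DeltaRow {
  dimension: string;
  a: number | null;     // score in run A, null if the dimension is absent
  b: number | null;     // score in run B, null if the dimension is absent
  delta: number | null; // b - a, null when either side is missing
}

function compareRuns(a: DimensionScores, b: DimensionScores): DeltaRow[] {
  // Union of dimension names from both runs, in a stable sorted order.
  const dims = Object.keys(a)
    .concat(Object.keys(b).filter((k) => !(k in a)))
    .sort();
  return dims.map((dimension) => {
    const av = a[dimension] ?? null;
    const bv = b[dimension] ?? null;
    const delta = av !== null && bv !== null ? bv - av : null;
    return { dimension, a: av, b: bv, delta };
  });
}

function formatDelta(delta: number | null): string {
  if (delta === null) return "n/a";
  const rounded = Math.round(delta * 10) / 10;
  if (rounded === 0) return "0.0";
  return (rounded > 0 ? "+" : "-") + Math.abs(rounded).toFixed(1);
}
```

For example, `compareRuns({ accuracy: 7, tone: 8 }, { accuracy: 8.5, safety: 9 })` yields rows for `accuracy`, `safety`, and `tone`, with a delta only for `accuracy` because the other two dimensions appear in one run each.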
Replace the blue-grey light theme with the "Aged Atlas" direction explored in the color-scheme prototypes:
- Light "Aged Chart" becomes warm aged paper with a cool dark navy-ink foreground ramp and brass kept as the lone warm accent, so surfaces and text separate clearly instead of reading as one grey field.
- Dark "Star Atlas" gains a subtle ink-navy ground for more night-sky depth.
- All fg steps verified against WCAG AA on both grounds (fg-1 ~4.9:1).
- DESIGN.md color contract synced to the new tokens.
- Add docs/plans/color-scheme-prototypes.html (3 explored directions).
Lift the light "Aged Chart" ground to a warm off-white (#ede8dd) instead of tan, switch the ink ramp to warm sepia, and add a light-theme-only paper texture (faint SVG grain + soft aging mottle on the .bg-bg ground) so the off-white reads as rustic old parchment. Cards (surface) stay clean as fresher sheets layered on the aged ground; the dark Star Atlas theme is unaffected. Contrast verified against WCAG AA (fg-1 ~4.9:1). DESIGN.md synced.
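A contrast check of this kind can be reproduced with the WCAG 2.x relative-luminance formula. The sketch below is a generic implementation, not the verification script used for this PR; `channel`, `luminance`, `contrastRatio`, and `passesAA` are hypothetical helper names, and only the ground colour `#ede8dd` comes from the commit message.

```typescript
// Sketch of a WCAG 2.x contrast check for 6-digit hex colours.
// Helper names are illustrative assumptions.

// Linearize one 8-bit sRGB channel per the WCAG relative-luminance definition.
function channel(c8: number): number {
  const c = c8 / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a colour written as "#rrggbb".
function luminance(hex: string): number {
  const h = hex.replace("#", "");
  const r = parseInt(h.slice(0, 2), 16);
  const g = parseInt(h.slice(2, 4), 16);
  const b = parseInt(h.slice(4, 6), 16);
  return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b);
}

// Contrast ratio (lighter + 0.05) / (darker + 0.05); ranges from 1 to 21.
function contrastRatio(fg: string, bg: string): number {
  const a = luminance(fg);
  const b = luminance(bg);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA for normal-size text requires at least 4.5:1.
function passesAA(fg: string, bg: string, threshold: number = 4.5): boolean {
  return contrastRatio(fg, bg) >= threshold;
}
```

Calling `passesAA(fgHex, "#ede8dd")` for each step of the foreground ramp, and again against the surface colour, would cover both grounds; the actual token values live in the theme tokens and DESIGN.md and are not reproduced here.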
Drop docs/plans/star-atlas-prototype.html and color-scheme-prototypes.html; they were one-off exploration artifacts and the chosen directions are now implemented in the real theme tokens.
… assets

Address the thermo-nuclear review findings and follow-up cleanup:

Score aggregation (single source of truth):
- One canonical resolver/aggregator in score-mapping.ts; remove the hardcoded DEFAULT_CATEGORY_VALUES fallback entirely. Each criterion's scaleConfig categoryValues is the only source. getRunResults now threads categoryValues to the client so the CSV export resolves on the true scale (this also fixes golden answer_accuracy, whose "none" category the old fallback never mapped).
- agents.ts (weakest dimension) and run-compare.ts share aggregateRunDimensions; run comparison resolves custom dimensions via their own scale and normalizes every dimension to a 0–10 axis.

Boundaries / perf:
- runCount means one thing (all runs per set) across rail, datasets, overview.
- Home reads the root loader instead of re-fetching the agents overview.
- Extract useLocalStorageBoolean for the rail-collapse persistence.

Design-system cleanup:
- Migrate all remaining components off the legacy token aliases and delete the bridge from tailwind.config.cjs (glean.*, cement.*, surface.page/primary, border.subtle, white, ink, shadow card/card-hover). Pure 1:1 token renames, rendering is unchanged in both themes.
- Centralize every shared glyph in components/icons.tsx (plus, chevron, search, sun, moon, cog, home, close, agent); drop the duplicated inline SVGs and the ActionIcon wrapper.
- Remove orphaned prototype preview PNGs.
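To make the "scaleConfig is the only source" idea concrete, here is a minimal sketch of resolving categories through a criterion's own `categoryValues` and mapping the result onto a 0–10 axis. It is not the contents of score-mapping.ts: the `ScaleConfig` shape, the helper names, the min-max normalization, and the example category values are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real score-mapping.ts may differ in shape and method.
interface ScaleConfig {
  categoryValues: Record<string, number>; // assumed shape: category label -> raw value
}

// Resolve a category strictly through the criterion's own scale; no global fallback.
function resolveCategory(scale: ScaleConfig, category: string): number | null {
  const known = Object.prototype.hasOwnProperty.call(scale.categoryValues, category);
  return known ? scale.categoryValues[category] : null;
}

// Assumed min-max normalization of a raw value onto a 0-10 axis.
function normalizeToTen(scale: ScaleConfig, value: number): number | null {
  const values = Object.keys(scale.categoryValues).map((k) => scale.categoryValues[k]);
  if (values.length === 0) return null;
  const min = Math.min(...values);
  const max = Math.max(...values);
  if (max === min) return null; // degenerate scale: nothing to normalize against
  return ((value - min) / (max - min)) * 10;
}

// Mean normalized score for one dimension; unresolved categories are skipped.
function aggregateDimension(scale: ScaleConfig, categories: string[]): number | null {
  let sum = 0;
  let count = 0;
  for (const category of categories) {
    const raw = resolveCategory(scale, category);
    const normalized = raw === null ? null : normalizeToTen(scale, raw);
    if (normalized !== null) {
      sum += normalized;
      count++;
    }
  }
  return count === 0 ? null : sum / count;
}
```

Because every lookup goes through the criterion's own scale, a category such as "none" resolves whenever that scale defines it, and an unknown label surfaces as `null` instead of silently picking up a default.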
Summary
Rebuilds the Seer web UI around an agent-centric information architecture and a committed dual-theme design system (Star Atlas dark / Aged Chart light), then hardens it against a code-quality review.
- `/compare/$agentId?a=&b=` dimension-by-dimension diff with normalized 0–10 deltas.
- `g h` / `g s` chords.

Review hardening (thermo-nuclear)
- `score-mapping.ts`; removed the hardcoded category-value fallback so each criterion's `scaleConfig` is the only source of truth (also fixes golden `answer_accuracy`). `categoryValues` now travels in the run-results payload for the CSV export.
- `agents.ts` and `run-compare.ts` share one aggregation; custom dimensions resolve on their own scale.
- `runCount` semantics; home reads the root loader (no double fetch); extracted `useLocalStorageBoolean`.
- `components/icons.tsx`; removed orphaned prototype assets.

Accessibility
Test plan
- `pnpm --filter web typecheck` clean
- `pnpm --filter web lint` (no new errors; pre-existing warnings only)