
Add version history page and prosemirror-collab step sync #127

Open
hendriebeats wants to merge 53 commits into master from version-history

@hendriebeats hendriebeats commented Feb 21, 2026

Contributor

Adds a version history page for documents and replaces the last-writer-wins sync model with a proper OT (operational transform) collab protocol using prosemirror-collab.

What changed

Version history page (/document/:id/history)

  • New sidebar listing editing sessions grouped by day/week/month
  • Click a session or sub-session to preview the document at that version
  • "Restore" button saves the preview doc to sessionStorage and reloads the editor, which sends a SaveDoc to persist it
  • "Document created" entry at the bottom previews the baseline snapshot before any edits

prosemirror-collab OT sync

  • Editor now uses prosemirror-collab to track a version number and produce/receive typed steps instead of raw JSON blobs
  • Backend stores each step in a new document_step table (versioned per doc)
  • On submit: if the incoming version matches the server, steps are accepted and broadcast; if not, the sender gets a DocConflict with the missing steps to rebase against
  • DocConfirmed / DocConflict / DocUpdated replace the previous single DocUpdated event
  • Multiple editors of the same document are now tracked in a Map ConnId Connection (replacing the single openedBy slot) and all receive broadcasts
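The accept-or-conflict rule above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: `StepLog`, `LoggedStep`, and `SubmitResult` are hypothetical names, the real check runs transactionally in the Haskell backend against the document_step table, and broadcasting to other editors is left out.

```typescript
// Hypothetical shapes for illustration; the real types live in Api/Websocket.hs.
interface LoggedStep<S> {
  step: S;
  clientId: string;
}

type SubmitResult<S> =
  | { kind: "DocConfirmed"; version: number }
  | { kind: "DocConflict"; version: number; steps: S[]; clientIds: string[] };

class StepLog<S> {
  // Index i holds the step that moved the document from version i to i + 1.
  private log: LoggedStep<S>[] = [];

  currentVersion(): number {
    return this.log.length;
  }

  submit(clientVersion: number, steps: S[], clientId: string): SubmitResult<S> {
    if (clientVersion !== this.currentVersion()) {
      // Stale sender: return the steps it has not seen so it can rebase.
      const missing = this.log.slice(Math.max(0, clientVersion));
      return {
        kind: "DocConflict",
        version: this.currentVersion(),
        steps: missing.map((entry) => entry.step),
        clientIds: missing.map((entry) => entry.clientId),
      };
    }
    for (const step of steps) {
      this.log.push({ step, clientId });
    }
    // In the PR, accepted steps are also broadcast to the other editors.
    return { kind: "DocConfirmed", version: this.currentVersion() };
  }
}
```

A sender that receives the conflict result applies the missing steps locally and resubmits at the new version.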

Snapshots

  • A baseline snapshot is inserted when a document is created or first opened (legacy documents)
  • Periodic snapshots every 50 versions (taken on SaveDoc) bound how many steps need replaying
  • History reconstruction: find nearest snapshot ≤ target version, replay steps from there
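A minimal sketch of the reconstruction rule, assuming each stored step carries the version it produces. `docAtVersion`, `Snapshot`, and `VersionedStep` are hypothetical names, and `apply` stands in for whatever applies a ProseMirror step to a document; none of them are taken from the PR's code.

```typescript
interface Snapshot<D> {
  version: number;
  doc: D;
}

interface VersionedStep<S> {
  version: number; // version of the document after this step is applied
  step: S;
}

// Returns undefined when no snapshot exists at or before the target version.
function docAtVersion<D, S>(
  snapshots: Snapshot<D>[],
  steps: VersionedStep<S>[],
  target: number,
  apply: (doc: D, step: S) => D
): D | undefined {
  // Pick the nearest snapshot whose version is <= target.
  let base: Snapshot<D> | undefined;
  for (const snap of snapshots) {
    if (snap.version <= target && (base === undefined || snap.version > base.version)) {
      base = snap;
    }
  }
  if (base === undefined) return undefined;

  // Replay only the steps between the snapshot and the target, in order.
  const baseVersion = base.version;
  return steps
    .filter((s) => s.version > baseVersion && s.version <= target)
    .sort((a, b) => a.version - b.version)
    .reduce((doc, s) => apply(doc, s.step), base.doc);
}
```

With a snapshot every 50 versions, the replay loop never has to apply more than 49 steps past the chosen baseline.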

Reviewer Q&A

Why insertSnapshotIfAbsent with raw SQL instead of Beam?
Beam does not expose ON CONFLICT DO NOTHING. Two users opening the same legacy document concurrently could both try to insert a baseline snapshot; this prevents a unique-constraint error.

Why is initVersion set to e.doc.version (current server version) on restore instead of the restored content's original version?
After a restore, the old content is immediately saved to the server at the current version via SaveDoc. The collab plugin needs to agree with the server on version number so subsequent steps are accepted — using the old version would cause every step to be treated as a conflict.

Why does handleOpenDoc still send the full document via SaveDoc protocol on restore?
The server has no dedicated "restore" endpoint; reusing SaveDoc is the minimal-change approach that correctly updates the document column and advances the snapshot baseline.

Why replicate n payload.clientId when building StepsPayload.clientIds?
The collab protocol assigns one clientId per step. Since a single InUpdated message can carry multiple steps but they all come from the same client tab, we broadcast [clientId, clientId, …] to match the per-step shape that receiveTransaction on receivers expects.
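For readers on the frontend side, the same replication can be sketched in TypeScript. `toStepsPayload` and the `StepsPayload` interface shown here are illustrative assumptions that mirror the {version, steps, clientIds} shape described in this PR, not the backend's actual definition.

```typescript
// Assumed broadcast shape: one clientId entry per step.
interface StepsPayload<S> {
  version: number;
  steps: S[];
  clientIds: string[];
}

// Equivalent of Haskell's `replicate n clientId`.
function replicateClientId(count: number, clientId: string): string[] {
  return Array.from({ length: count }, () => clientId);
}

function toStepsPayload<S>(version: number, steps: S[], clientId: string): StepsPayload<S> {
  return { version, steps, clientIds: replicateClientId(steps.length, clientId) };
}
```

Keeping the two arrays the same length lets a receiver hand `steps` and `clientIds` to receiveTransaction without reshaping them.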

Why was OutDocOpenedOther (kick-out logic) removed?
The old model allowed only one editor at a time per document. With the OT model any number of editors can coexist, so the kick-out message is no longer needed or sent.

Why is maybeTakeSnapshot called inside handleSave rather than handleUpdated?
handleSave already receives the full document JSON (the client sends it). handleUpdated only receives steps; reconstructing the document there would require replaying all steps, which is wasteful given that the client does it anyway and sends the result via SaveDoc.

Why sessionClientId = computerId + "_" + randomStr instead of just computerId?
computerId is stable across tabs; if two tabs on the same machine both sent steps under it, they would share a clientId, and receiveTransaction in each tab would mistake the other tab's steps for its own already-confirmed steps instead of applying them. The random suffix makes each tab session unique.
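One way the per-tab identifier could be built is sketched below. `randomSuffix` and `makeSessionClientId` are hypothetical helpers; the PR only specifies the `computerId + "_" + randomStr` shape, so the suffix length and alphabet here are assumptions.

```typescript
// Builds a short random string; length and alphabet are arbitrary choices.
function randomSuffix(length: number = 8): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return out;
}

// Call once per tab at editor start-up and reuse the result for every message.
function makeSessionClientId(computerId: string): string {
  return computerId + "_" + randomSuffix();
}
```

The identifier must be generated once per editor instance; regenerating it per message would stop the collab plugin from recognising its own confirmed steps.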

hendriebeats and others added 11 commits February 21, 2026 14:16
Introduces server-side step storage and version tracking as the
foundation for real-time collaborative editing and version history.

- Migration: adds `version` column to `document`; creates
  `document_step` (persistent step log) and `document_snapshot`
  (periodic full-doc checkpoints every 500 steps) tables
- Database.hs: Beam types for both new tables; `version :: Int32`
  field on DocumentT
- Types.hs: DocVersionNum newtype
- Entity/Document.hs: getDocVersion, updateDocVersion, insertSteps,
  getStepsSince, insertSnapshot, getLatestSnapshot, maybeTakeSnapshot;
  GetDoc now includes current version (sent to client on open)
- Api/Websocket.hs: InUpdated now carries structured payload
  {version, steps, clientId}; handleUpdated performs a transactional
  version check — inserts steps and replies OutDocConfirmed on match,
  or returns OutDocConflict with steps-since on mismatch; handleSave
  triggers a snapshot when version % 500 == 0

Frontend collab integration (prosemirror-collab) comes in a follow-up
batch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the ad-hoc step-sending approach with the official collab
plugin, enabling proper version tracking and conflict resolution:

- Add prosemirror-collab dependency; register collab() plugin after
  history() with per-tab sessionClientId (computerId + UUID)
- Send {version, steps, clientId} payloads instead of raw step arrays
- Handle DocConfirmed (server ack) via receiveTransaction to clear the
  unconfirmed buffer; inflight flag prevents duplicate sends
- Handle DocConflict (version mismatch) by applying server steps and
  letting sendableSteps rebase automatically on next dispatch
- Add version field to DocRaw and LastUpdate types
- DocUpdated broadcast now carries {version, steps, clientIds} so the
  group-study side panel uses receiveTransaction too
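The inflight guard mentioned in the list above might look roughly like this. `StepSender` is a hypothetical class for illustration only; the actual editor code is not shown in this PR description and may gate sends differently.

```typescript
interface OutgoingSteps<S> {
  version: number;
  steps: S[];
  clientId: string;
}

class StepSender<S> {
  private inflight = false;
  private send: (payload: OutgoingSteps<S>) => void;
  private clientId: string;

  constructor(send: (payload: OutgoingSteps<S>) => void, clientId: string) {
    this.send = send;
    this.clientId = clientId;
  }

  // Sends the unconfirmed steps unless a previous batch is still awaiting a reply.
  trySend(version: number, unconfirmed: S[]): boolean {
    if (this.inflight || unconfirmed.length === 0) return false;
    this.inflight = true;
    this.send({ version, steps: unconfirmed, clientId: this.clientId });
    return true;
  }

  // Call on DocConfirmed or DocConflict so the next batch can go out.
  onServerReply(): void {
    this.inflight = false;
  }
}
```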
Exposes two authenticated endpoints for the history UI:
- GET /api/document/:documentId/history
  Returns step activity grouped into 5-minute buckets using raw SQL
  (date_trunc grouping not expressible in Beam DSL); restricted to
  steps authored by the requesting user.
- GET /api/document/:documentId/at-version/:versionNum
  Returns the nearest snapshot plus steps needed to reconstruct the
  document at any target version.

Both endpoints 403 if the user is not in the document's editor list.
- New /study/:id/history page with sidebar of version groups (5-minute
  buckets) and read-only ProseMirror preview of any selected version
- Reconstruct historical doc from nearest snapshot + steps client-side
  using Transform; no server-side doc reconstruction needed
- "Restore this version" saves the doc JSON to sessionStorage and
  navigates to /study/:id?restore=true; on next open the editor loads
  the restored content and saves it as a normal edit
- Parcel multi-entry: history.tsx compiled alongside index.tsx
- history.tsx loaded via /static/editor/history.js (prod) or
  Parcel dev server port (local)
- Move history API routes from /api/document/... to /document/...
  so they are served by the HTMX/Scotty server instead of the
  Servant API server
- Add "Version History" link to the study page nav menu
- Add comment explaining why two separate Parcel entry bundles
  (index.js + history.js) are safe to load independently
- Ignore .playwright-mcp and .vscode in git
The history preview was broken for documents that had never triggered a
collab snapshot (every 500 versions). It would either fetch a snapshot
ahead of the requested version or return an empty object, causing
ProseMirror to throw when reconstructing the doc.

- Add `getLatestSnapshotBefore` to only consider snapshots at or before
  the target version, preventing a future snapshot from being used as
  the replay baseline
- Add `getDocBase` as a fallback that reads the document's saved JSON
  and version directly from the document table
- Fix the frontend API route prefix (/document/ not /api/document/)
- Guard against an empty snapshot object before calling
  Node.fromJSON, falling back to an empty top-level node
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Raise ProseMirror newGroupDelay from 133ms to 250ms so undo steps
  group at a more natural typing pause
- Change version history sub-group buckets from 133ms to 30-second
  windows, giving meaningful granularity within each 5-minute group
- Fix baseline snapshot logic: capture a snapshot when none exists yet
  (not only at version 0), stamped at the current version so history
  can replay forward correctly
… types

- Fix restore not persisting: restored doc is now written to localStorage
  and saved directly over the WebSocket immediately on page load, so
  navigating away and back no longer reverts to the pre-restore content.
  Root cause: onUpdate only fires on user edits, not on editor init, so
  the restored doc was displayed but never pushed to the server or local
  cache.
- Separate checkRestoreDoc from checkLastUpdateFallback so the restore
  path is explicit and sessionStorage is consumed exactly once.
- Collapse SubHistoryGroup into a type alias for HistoryGroup (identical
  shape in both Haskell and TypeScript).
- Add aria-expanded / aria-controls to accordion buttons for screen
  reader support.
- Suppress decorative CSS triangle characters from the accessibility
  tree via content alt-text syntax.
- Remove dead commented-out visualViewport resize listener.
The top-level accordion items previously showed raw ProseMirror step
counts, which were meaningless to users (a single word could be dozens
of steps). They now show how many 30-second sub-sessions are inside,
which matches exactly the number of items revealed on expand.

- SQL: COUNT(DISTINCT floor(epoch/30)) instead of COUNT(*) to count
  distinct 30-second buckets within each 5-minute group
- Label changed from "edits" to "sessions" to match the new meaning
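To make the new count concrete, here is a client-side sketch of the same arithmetic, assuming step timestamps are available as epoch seconds. `countSubSessions` is a hypothetical helper; in the PR the grouping is done in SQL, not in TypeScript.

```typescript
// Maps each 5-minute group (keyed by its start, in epoch seconds) to the
// number of distinct 30-second buckets that contain at least one step.
function countSubSessions(
  epochSeconds: number[],
  groupSeconds: number = 300,
  subSeconds: number = 30
): Map<number, number> {
  const bucketsByGroup = new Map<number, Set<number>>();
  for (const t of epochSeconds) {
    const groupStart = Math.floor(t / groupSeconds) * groupSeconds;
    const subBucket = Math.floor(t / subSeconds);
    let subs = bucketsByGroup.get(groupStart);
    if (subs === undefined) {
      subs = new Set<number>();
      bucketsByGroup.set(groupStart, subs);
    }
    subs.add(subBucket);
  }
  const counts = new Map<number, number>();
  bucketsByGroup.forEach((subs, groupStart) => {
    counts.set(groupStart, subs.size);
  });
  return counts;
}
```

Three steps typed within the same 30 seconds therefore count as one session, regardless of how many ProseMirror steps they produced.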
hendriebeats and others added 2 commits February 21, 2026 19:54
- Replace bare placeholder text with a rich empty state (icon, heading,
  description, hint arrow) to orient new users
- Add a version pill in the preview toolbar showing which snapshot is
  open; auto-hides via ResizeObserver when the toolbar is too narrow
- Add a "← Versions" back button (hidden on desktop, shown on mobile)
  so users can return to the list after opening a preview
- Implement a CSS-only mobile layout: sidebar and preview swap via
  the `preview-open` class on #historyLayout; JS toggles it when a
  version is selected or the back button is clicked
- Rename `historyPreviewActions` → `previewToolbar` for clarity
- Add a hint paragraph below the "Versions" header
- Fix triangle expand/collapse glyphs to use Unicode escapes so they
  render reliably across browsers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Swaps the janky document-name hyperlink in the history page header for a
clean back-arrow icon (arrow2.svg) linking to the document, followed by
a static "Version History" heading. Prevents long titles from breaking
the mobile layout. Adds hover animation (gray circle + scale) and a
keyboard focus ring for accessibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@hendriebeats hendriebeats changed the title from "Add version history and step-based collaborative editing" to "Add version history and step-based editing updates" Feb 22, 2026
- Show each bucket's latest edit time (endedAt) instead of earliest
- Move the chevron expand button to the right side of each accordion row
- Widen the expand button from 32px to 40px for easier tap/click target
- Insert a baseline snapshot at version 0 when a document is created,
  ensuring history replay has a starting point without relying on a
  lazy first-edit heuristic in the WebSocket handler (which is removed).
- Change the top-level "step count" to count distinct 1-minute buckets
  of activity (i.e. editing sessions), and relabel the pill as
  "N session(s)" instead of "N edit(s)" to match the new semantics.
- Drop `stepCount` from `SubHistoryGroup` — sub-items now show only
  the time, keeping the expanded view uncluttered.
- Sub-item left margin and padding tweaks for better visual alignment.
- Fall back to the document's saved content (getDocBase) when no collab
  snapshot exists, so legacy documents reconstruct correctly instead of
  showing a blank document.
- Mark loadSubGroups failures as retryable: only set subLoaded=true on
  success so collapsing and re-expanding retries the network request.
- Add focus-visible rings to history sidebar buttons for keyboard
  accessibility.
- Remove duplicate overflow-y scroll on #historyList outer container.
- Remove dead .historyStepCount CSS rule (class no longer emitted).
- Fix media query indentation for #restoreButton.
- Add aria-hidden to decorative clock emoji in empty state.
- Correct migration comment: snapshot cadence is every 50 steps, not 500.
@hendriebeats hendriebeats changed the title from "Add version history and step-based editing updates" to "Add version history page and prosemirror-collab step sync" Feb 23, 2026
hendriebeats and others added 5 commits February 23, 2026 18:43
Without a baseline snapshot, getDocAtVersion fell back to getDocBase,
which returns the document's current (latest) version number and content.
Because that version was already past any requested targetVersion,
getStepsSince returned nothing and every preview rendered the same
latest content.

Two fixes:

- Websocket.hs: before inserting steps, if no snapshot exists at or
  before currentVersion, save document.document as a baseline snapshot.
  This gives the step-replay logic a valid anchor for all future edits,
  including the first editing session on legacy documents.

- DocumentHistory.hs: when the getDocBase fallback returns a version
  already ahead of targetVersion (legacy doc with no pre-step snapshot),
  fall back to (0, emptyDoc) instead of silently serving the wrong
  content for every version.

Note: steps already recorded without a baseline (e.g. documents edited
before this fix) cannot be reconstructed retroactively, as the
pre-step-1 document state was never captured.
Two bugs in the version history feature:

1. The baseline snapshot for legacy documents was being saved in
   handleUpdated using document.document from the DB, which could be
   stale relative to what the client was actually editing (e.g. when
   the client loaded from localStorage). This caused the version
   history preview to show content that differed from the real
   document. Fix: move baseline snapshot logic to handleSave, where
   the client explicitly sends its current document content. This
   ensures the snapshot always reflects what the client is editing.

2. QuestionsView placed noQuestionsText inside contentDOM. ProseMirror
   manages contentDOM and when it detects DOM content that does not
   match the document model it reconciles by generating a replace step,
   converting the placeholder text into a real question node. Fix:
   move noQuestionsText outside contentDOM into a wrapper td; make
   contentDOM a div inside that same cell. The update() method now
   also toggles placeholder visibility when questions are added or
   removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The version list had no way to view a document's original state before
any edits. This was especially confusing for documents that predated
version history: opening and immediately editing would lump the
baseline and the first session into one indistinguishable group.

- Pin a flat "Document created" item at the bottom of the history list,
  visually styled like a group-level entry but without an accordion
- Clicking it previews startVersion-1, the baseline snapshot taken
  before the very first step, so the pre-edit state is always reachable
- The first session's edits remain in the normal accordion list so
  individual changes within that session are still accessible
- Adds .historyFirstVersion CSS class with matching selected/hover/focus
  styles; deselection logic updated to cover the new class
Move the "capture baseline snapshot" logic from the DocSave handler into
the DocUpdated (step-insertion) handler, and execute it before any steps
are written.

Previously, the snapshot was taken during DocSave using the
client-supplied document content, which could differ from the server's
stored state (e.g. if the client had localStorage content ahead of the DB).
This caused history replay to start from a potentially incorrect baseline.

By capturing the snapshot in DocUpdated — before any steps are inserted —
and using getDocBase to read the actual stored document, replay is
guaranteed to start from the server-authoritative pre-edit state.
Move baseline snapshot creation from the DocUpdated (step-insertion)
handler to handleOpenDoc, and use an idempotent INSERT … ON CONFLICT DO
NOTHING variant (insertSnapshotIfAbsent) to handle concurrent opens of
the same legacy document safely.

Previously the snapshot was captured just before the first steps were
inserted, which required threading snapshot logic into the hot edit path.
Doing it at open time is simpler, happens before any edits arrive, and
correctly reflects the server-authoritative state at the moment the user
opens the document.

- Add insertSnapshotIfAbsent using raw SQL with ON CONFLICT DO NOTHING
  to avoid duplicate-key errors when multiple users open the same doc
  simultaneously.
- Remove the snapshot check/insert from handleUpdated.
@hendriebeats hendriebeats added Size: Large 401–1,000 lines: Higher risk; usually should be split unless it’s a mechanical refactor. Siee: XL / Too large 1,000+ lines: Hard to review; often requires staged PRs or special review strategy. and removed Size: Large 401–1,000 lines: Higher risk; usually should be split unless it’s a mechanical refactor. labels Feb 24, 2026
The study page template loads the editor JS from
/static/editor/index.js (served by Scotty's static middleware)
whenever the env is not Dev "local" — which includes CI. The
workflow was starting a Parcel dev server instead of building,
so the file never existed and the editor never initialised.

The "Wait for frontend to be ready" health-check on port 3001
was hitting Scotty (which always runs there), masking the failure.

- Replace `npm run dev` + port-poll with `npm run build`
- Copy frontend/dist/{index,history}.js to backend/static/editor/
Downloading Chromium + font packages on every run was slow.
Cache ~/.cache/ms-playwright keyed on e2e/package-lock.json so the
browser binary is reused across runs. On cache hit, only system-level
deps are (re)installed via playwright install-deps, which is fast since
most packages are already present on ubuntu-22.04.
Three additional caches to reduce build time:

- ~/.cabal/packages + ~/.cabal/store + backend/dist-newstyle, keyed on
  backend.cabal with a restore-key fallback. The Cabal store holds all
  compiled Haskell dependencies, which is the most expensive step to
  rebuild. A partial hit (source changed, deps unchanged) still reuses
  the compiled deps.
- frontend/node_modules keyed on frontend/package-lock.json
- e2e/node_modules keyed on e2e/package-lock.json

npm install steps now use --prefer-offline so a warm node_modules cache
avoids network fetches entirely.
window.isLocal is only set in Dev "local" mode. In CI (Dev "ci"),
it is unset, so the frontend was connecting over wss:// against a
plain-HTTP server — the handshake failed and the editor never
initialised.

Deriving the protocol from window.location.protocol is correct for
all environments: http: → ws://, https: → wss://.
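The mapping can be sketched as a pure function so it is testable without a browser. `websocketProtocol` and `websocketUrl` are hypothetical names, and the `/ws` path in the usage note is a placeholder rather than the app's real endpoint.

```typescript
// Maps the page protocol to the matching WebSocket protocol.
function websocketProtocol(pageProtocol: string): "ws:" | "wss:" {
  return pageProtocol === "https:" ? "wss:" : "ws:";
}

// Accepts anything shaped like window.location so it also works in tests.
function websocketUrl(loc: { protocol: string; host: string }, path: string): string {
  return websocketProtocol(loc.protocol) + "//" + loc.host + path;
}
```

In the browser this would be called as `websocketUrl(window.location, "/ws")`, which removes the dependency on window.isLocal being set.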
nix develop pulls down HLS, hoogle, hlint, fourmolu, lsp, etc. on
every run from cache.nixos.org. Adding nix-community/cache-nix-action
persists /nix/store between runs, keyed on flake.lock. A restore-key
fallback allows partial hits when only some inputs change.
- Add Playwright JSON reporter in CI (writes to test-results/results.json)
- After tests (pass or fail), post a formatted markdown table to the PR
  showing each test's status, name, and duration
- Uses a hidden marker comment so re-runs update the same comment rather
  than creating new ones
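The update-in-place behaviour can be sketched as a small decision function. The marker string, `PrComment`, and `planResultsComment` are all hypothetical; the workflow's actual marker text and API calls are not shown in this commit message.

```typescript
// Hypothetical marker; an HTML comment is invisible in rendered markdown.
const RESULTS_MARKER = "<!-- e2e-results -->";

interface PrComment {
  id: number;
  body: string;
}

type CommentAction =
  | { kind: "update"; id: number; body: string }
  | { kind: "create"; body: string };

// Decides whether to edit the bot's earlier comment or post a new one.
function planResultsComment(existing: PrComment[], table: string): CommentAction {
  const body = RESULTS_MARKER + "\n" + table;
  for (const comment of existing) {
    if (comment.body.indexOf(RESULTS_MARKER) !== -1) {
      return { kind: "update", id: comment.id, body };
    }
  }
  return { kind: "create", body };
}
```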
github-actions Bot commented Feb 24, 2026

E2E Tests ❌ — 6 passed, 8 failed

| Test | Duration |
| --- | --- |
| collab-sync.spec.ts › collab sync › tab 1 types, tab 2 receives the update via DocUpdated | 5.5s |
| collab-sync.spec.ts › collab sync › tab 2 types, tab 1 receives the update via DocUpdated | 5.2s |
| collab-sync.spec.ts › collab sync › concurrent edits resolve via OT: both tabs eventually contain both strings | 5.7s |
| collab-sync.spec.ts › collab sync › second tab opening an already-edited document receives the full document via DocOpened | 8.6s |
| version-history.spec.ts › version history › history page loads with "Version History" heading | 3.0s |
| version-history.spec.ts › version history › history page shows edit sessions after typing in a document | 7.4s (2 retries) |
| version-history.spec.ts › version history › clicking a history group loads a read-only preview | 8.3s (2 retries) |
| version-history.spec.ts › version history › "Document created" entry is visible at the bottom of the list | 7.3s (2 retries) |
| version-history.spec.ts › version history › expanding an accordion reveals sub-items | 7.6s (2 retries) |
| version-history.spec.ts › version history › restore flow: restoring a version lands on the study page with correct content | 7.5s (2 retries) |
| version-history.spec.ts › version history › "Document created" entry click loads a read-only preview | 7.1s (2 retries) |
| version-history.spec.ts › version history › clicking a sub-item in the accordion loads a preview | 7.6s (2 retries) |
| version-history.spec.ts › version history › back button closes the preview panel | 7.6s (2 retries) |
| version-history.spec.ts › version history › history link in the study editor navigates to the history page | 1.7s |

Tests the WebSocket collaboration flow end-to-end using two isolated
browser contexts (same user, separate cookie jars):

- Tab 1 types → Tab 2 receives via DocUpdated (and vice versa)
- Concurrent edits trigger DocConflict → OT rebase → both tabs converge
- Cold open of an already-edited document replays steps from DocOpened

Adds:
- openStudy molecule — navigates a second page to an existing study
  and waits for the editor to become interactive
- EditorAtoms.assertContainsWithTimeout — SLA-bounded sync assertion
- collab-sync.spec.ts organism with the four scenarios above
- Branch suite trimmed to version-history + collab-sync only

Also fixes navigateToHistory to remove the assertPageLoaded call that
was causing a layer violation (molecule calling an atom assertion), and
adds editor.assertVisible() before editor.assertContains() in the
restore-version test so failures identify the correct missing element.
- e2e.yml: ignore PRs targeting main/master/production so tests only
  run on feature branch PRs, not on merge PRs
- production.yml: remove pull_request trigger entirely so deployments
  only happen on direct push to the production branch
stats.ok is unreliable in Playwright 1.58 — replace with
stats.unexpected === 0 as the success condition.
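A sketch of the new success check, assuming the JSON report's `stats` object exposes `expected`, `unexpected`, `flaky`, and `skipped` counters. `ReportStats`, `runSucceeded`, and `summarize` are illustrative names, and the summary wording is an assumption rather than the workflow's actual output.

```typescript
// Assumed subset of the `stats` object in test-results/results.json.
interface ReportStats {
  expected: number;   // tests that passed as expected
  unexpected: number; // tests that failed unexpectedly
  flaky: number;      // tests that passed only after a retry
  skipped: number;
}

// The run counts as green only when nothing failed unexpectedly.
function runSucceeded(stats: ReportStats): boolean {
  return stats.unexpected === 0;
}

function summarize(stats: ReportStats): string {
  const icon = runSucceeded(stats) ? "passed" : "failed";
  return "E2E run " + icon + ": " + stats.expected + " passed, " + stats.unexpected + " failed";
}
```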
The comment step previously only ran on pull_request events, so runs
triggered by a push to a feature branch silently skipped it. This meant
the PR comment was never created or updated when CI ran via push.

- Add push trigger to e2e.yml (branches-ignore: master)
- Remove pull_request guard from the comment step
- On push events, look up the open PR for the branch via the API;
  skip gracefully if none exists yet

Labels

Siee: XL / Too large 1,000+ lines: Hard to review; often requires staged PRs or special review strategy. Work in Progress

1 participant