[Perf] Profile grid-related slowness and decide between optimisation, PyO3, or rewrite #203

@michaelbeijer

Background

Working in the grid in Supervertaler Workbench feels sluggish in several places — slow cold start, lag while scrolling/typing in large projects, latency on segment-to-segment navigation, and noticeable wait on batch operations. The instinct has been to blame Python and consider a rewrite (an earlier Tauri/Rust prototype lives in D:\Dev\Sv\Supervertaler2).

Before committing to a multi-month rewrite, the goal of this issue is to measure where the time actually goes and decide between three very different fix paths:

  1. Targeted PyQt6 / architecture tuning (cheapest, days of work)
  2. A small Rust extension via PyO3 for genuine inner-loop hot paths (weeks of work, keeps the rest of the codebase intact)
  3. Full rewrite in Tauri/Rust or native Rust GUI (months of work, last resort)

The hypothesis going in: most perceived slowness is architectural (Qt model layer over-calling, synchronous TM/term lookups on focus change, no virtualisation, etc.) rather than Python-the-language. A profiling pass will tell us whether that's right.


One-time setup

cd D:\Dev\Sv\Supervertaler
.\.venv\Scripts\Activate.ps1
pip install py-spy snakeviz
  • py-spy: sampling profiler that attaches to a running process, no code changes needed. Best for "the app is sluggish right now, what's it doing?"
  • snakeviz: browser-based viewer (icicle and sunburst charts) for cProfile output. Best for analysing a specific operation after the fact.
  • cProfile: ships with Python, no install needed.
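
None of the scenarios below calls cProfile directly, so here is a minimal sketch of how a single operation could be profiled and saved for snakeviz. `load_project_stub` and the output file name are hypothetical stand-ins, not names from the Workbench codebase.

```python
import cProfile
import pstats
from contextlib import contextmanager


@contextmanager
def profiled(out_path, top=15):
    """Profile the enclosed block, dump stats to out_path, print the top entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield profiler
    finally:
        profiler.disable()
        profiler.dump_stats(out_path)  # binary stats file that snakeviz can open
        stats = pstats.Stats(out_path)
        stats.sort_stats("cumulative").print_stats(top)


def load_project_stub(n_segments):
    # Hypothetical stand-in for whatever call actually loads a project.
    return [f"segment {i}".upper() for i in range(n_segments)]


if __name__ == "__main__":
    with profiled("cold-open.prof"):
        load_project_stub(10_000)
    # Inspect interactively afterwards with: snakeviz cold-open.prof
```

cProfile adds noticeable overhead per function call, so its absolute timings run higher than normal execution; the relative ranking of functions is the useful part.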

Test corpus

Need one realistic large project to profile against:

  • An SDLXLIFF or DOCX-derived project with 2,000–10,000 segments
  • A TM attached with 50k–500k TUs so lookups exercise the index
  • Optional: a MultiTerm .sdltb attached

If no real project is to hand, generate a synthetic one.
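
One way to generate a synthetic project, as a rough sketch: the script below writes a plain XLIFF 1.2 file with a chosen number of untranslated segments. It assumes the Workbench can import generic XLIFF; it does not produce SDLXLIFF, and the word list, language codes, and file names are placeholders.

```python
import random
import xml.etree.ElementTree as ET

WORDS = ["segment", "translation", "memory", "term", "grid", "project",
         "source", "target", "match", "lookup"]


def build_xliff(n_segments, seed=0):
    """Build a plain XLIFF 1.2 tree with n_segments untranslated trans-units."""
    rng = random.Random(seed)  # fixed seed so repeated runs profile the same input
    root = ET.Element("xliff", version="1.2",
                      xmlns="urn:oasis:names:tc:xliff:document:1.2")
    file_el = ET.SubElement(root, "file", {
        "original": "synthetic.docx",   # placeholder name
        "datatype": "plaintext",
        "source-language": "en-GB",     # placeholder language pair
        "target-language": "nl-NL",
    })
    body = ET.SubElement(file_el, "body")
    for i in range(1, n_segments + 1):
        unit = ET.SubElement(body, "trans-unit", id=str(i))
        words = rng.choices(WORDS, k=rng.randint(5, 25))
        ET.SubElement(unit, "source").text = " ".join(words).capitalize() + "."
        ET.SubElement(unit, "target")  # left empty: nothing translated yet
    return ET.ElementTree(root)


if __name__ == "__main__":
    build_xliff(5000).write("synthetic-5000.xlf",
                            encoding="utf-8", xml_declaration=True)
```

Random word salad will not produce realistic fuzzy-match rates against a real TM, so lookup timings from a synthetic project are best treated as a lower bound.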


Scenarios to profile

Five scenarios, ordered from "almost certainly slow" to "probably fine but let's confirm":

1. Cold open of a large project

Most likely Python-startup + XML parsing dominated.

python -X importtime Supervertaler.py 2> import-time.log
# open the test project, close as soon as it's fully loaded

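Also capture wall-clock time: stopwatch from double-click to fully loaded grid. The import-time log covers module imports only, so project parsing shows up in the wall-clock figure, not in import-time.log.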
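
The import-time log is long and nested, so a small helper to rank the slowest imports may save some reading. This is a sketch that assumes the standard `import time: self [us] | cumulative | imported package` line layout written by `-X importtime`; the function names are made up for illustration.

```python
def parse_importtime(lines):
    """Return (cumulative_us, self_us, module) rows from -X importtime output."""
    prefix = "import time:"
    rows = []
    for line in lines:
        if not line.startswith(prefix):
            continue  # skip anything else the app wrote to stderr
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        self_us, cumulative_us, module = (p.strip() for p in parts)
        if not (self_us.isdigit() and cumulative_us.isdigit()):
            continue  # header row
        rows.append((int(cumulative_us), int(self_us), module))
    return rows


def slowest_imports(lines, top=20):
    """Rows sorted by cumulative import time, largest first."""
    return sorted(parse_importtime(lines), reverse=True)[:top]


if __name__ == "__main__":
    # Assumption: the log is UTF-8. A log redirected by Windows PowerShell
    # may be UTF-16 instead, in which case pass encoding="utf-16".
    with open("import-time.log", encoding="utf-8") as fh:
        for cumulative, own, module in slowest_imports(fh):
            print(f"{cumulative / 1000:9.1f} ms  (self {own / 1000:8.1f} ms)  {module}")
```

Cumulative time includes nested imports, so a package near the top of the list is often slow only because of one heavy dependency further down.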

2. Scrolling through the grid

Most likely architectural (model overuse, no virtualisation).

py-spy record -o scroll.svg --pid <workbench-pid> --duration 15
# scroll vigorously through the grid for ~15 seconds

3. Typing in a segment

Most likely signal-storm / re-render on every keystroke.

py-spy record -o typing.svg --pid <workbench-pid> --duration 10
# type a 100-char target translation at normal speed

4. Segment-to-segment navigation

Likely culprit: TM lookups, term highlighting, match panel updates firing synchronously on focus change.

py-spy record -o navigate.svg --pid <workbench-pid> --duration 15
# Down-arrow through 50 segments at normal review pace

5. Batch op (e.g. find/replace across all segments)

Most likely an inner-loop candidate for a Rust extension.

py-spy record -o batch.svg --pid <workbench-pid> --duration 30
# trigger the batch op
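
Before treating the batch op as a Rust candidate, it may help to time the bare string work in isolation. The sketch below is a hypothetical baseline, not the Workbench's actual find/replace code: it runs one compiled regex over an in-memory list of segments and reports the elapsed time.

```python
import re
import time


def replace_all(segments, pattern, replacement):
    """Apply one compiled regex to every segment; return (new_segments, hit_count)."""
    compiled = re.compile(pattern)
    out, hits = [], 0
    for text in segments:
        new_text, n = compiled.subn(replacement, text)
        out.append(new_text)
        hits += n
    return out, hits


if __name__ == "__main__":
    # Placeholder corpus: 10,000 identical segments.
    segments = ["The quick brown fox jumps over the lazy dog."] * 10_000
    start = time.perf_counter()
    _, hits = replace_all(segments, r"\bfox\b", "cat")
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{hits} replacements in {elapsed_ms:.1f} ms")
```

If this baseline is fast but the in-app operation is slow, the time is probably going into per-segment model updates or signals rather than the loop itself, which points back at PyQt6 tuning rather than PyO3.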

Output collection

Drop everything in D:\Dev\Sv\Supervertaler\.dev\profiling-<date>\:

  • import-time.log
  • scroll.svg, typing.svg, navigate.svg, batch.svg
  • Wall-clock numbers (plain text: "cold open: 8.2s", "type 100 chars: noticeable lag at ~30 chars in")
  • Short note on what felt slow during each scenario

Flame-graph SVGs are the gold here — they show exactly which functions ate the time.


Diagnosis matrix

For each flame graph, the fix shape follows directly from the pattern:

| Pattern in the flame graph | Diagnosis | Fix |
| --- | --- | --- |
| Most time in QAbstractTableModel.data() or paintEvent | Grid architecture: over-calling the model | PyQt6 tuning: cache, batch, virtualisation audit |
| Most time in Python for loops over segments | Inner-loop hot path | Small PyO3 Rust extension |
| Most time in requests / httpx / urllib / sqlite3 | I/O or network bound | Background thread, debounce, cache |
| Most time in xml / lxml / etree parsing | File-parse cost | lxml tuning, lazy loading, or Rust XML parser via PyO3 |
| Time spread thinly across many functions, none dominant | Genuinely systemic | The rewrite case starts to firm up |
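
To put rough numbers on the matrix above, the same recordings can be taken with `py-spy record --format raw`, which writes collapsed stacks (one `frame;frame;... count` line per unique stack) instead of an SVG. The sketch below buckets those samples by the deepest matching frame. The bucket names, keywords, and the assumption that each frame looks like `function (file.py:line)` are illustrative and will need adjusting to the frames that actually appear.

```python
from collections import Counter

# Assumed function names for Python-side Qt overrides.
GRID_FUNCS = {"data", "paintEvent"}
# Substring keywords matched against the whole frame text (function and file).
KEYWORD_BUCKETS = {
    "io/network": ("requests", "httpx", "urllib", "sqlite3"),
    "xml-parse": ("lxml", "etree", "xml"),
}


def classify(stack):
    """Bucket a collapsed stack by its deepest frame that matches a rule."""
    for frame in reversed(stack.split(";")):  # leaf frame first
        func = frame.partition(" (")[0]
        if func in GRID_FUNCS:
            return "grid/model"
        for bucket, keywords in KEYWORD_BUCKETS.items():
            if any(k in frame for k in keywords):
                return bucket
    return "other"


def bucket_shares(lines):
    """Return {bucket: fraction of all samples} for collapsed-stack lines."""
    counts = Counter()
    for line in lines:
        stack, _, samples = line.strip().rpartition(" ")
        if not samples.isdigit():
            continue  # blank or malformed line
        counts[classify(stack)] += int(samples)
    total = sum(counts.values())
    return {b: n / total for b, n in counts.items()} if total else {}


if __name__ == "__main__":
    with open("scroll.txt", encoding="utf-8") as fh:  # hypothetical raw output file
        for bucket, share in sorted(bucket_shares(fh).items(), key=lambda kv: -kv[1]):
            print(f"{share:6.1%}  {bucket}")
```

A large "other" share does not by itself mean the slowness is systemic; it may only mean the keyword lists are incomplete, so the SVG is still worth reading alongside these numbers.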

Predicted outcome

Most likely: scenarios 1 and 5 will reveal Python/parsing bottlenecks fixable by a small PyO3 extension (path 2, weeks rather than months), and scenarios 2–4 will reveal architectural issues in the Qt model layer that have nothing to do with Python being slow.

If that's wrong and everything is uniformly slow with no clear hot spots, that's when Tauri/Rust gets seriously reconsidered.


Time budget

  • Setup + test project: 30 min
  • Five profiling runs: 45 min
  • Analysis + first round of fixes: 1–2 hours
  • Decision: same day

Total: roughly a half-day. Cheap relative to a multi-month rewrite.


Definition of done

  • All five profiles (import-time.log plus the four flame graphs) captured and saved to the repo (or a gist).
  • A summary comment on this issue identifying the dominant bottleneck per scenario.
  • A decision recorded here: PyQt6 tuning / PyO3 extension / rewrite, with concrete next steps for whichever path is chosen.
