Skip to content

Flaky test: "saveFlushCompleted fires after adding a trip" races on cross-thread signal delivery #520

@vpicaver

Description

@vpicaver

Summary

saveFlushCompleted fires after adding a trip (testcases/test_cwSaveLoad.cpp:2364) fails intermittently with flushSpy.count() >= 1 returning 0 >= 1. Reproduces at roughly 1/20 runs (5%) on a macOS Debug build with ASan.

The same flake rate reproduces on a clean baseline (no local changes), so this is not regression fallout from any recent work — it's a pre-existing race in the test's synchronization contract.

testcases/test_cwSaveLoad.cpp:2383: FAILED:
  CHECK( flushSpy.count() >= 1 )
with expansion:
  0 >= 1

Root cause

cwProject::saveFlushCompleted is a re-emit across thread boundaries:

  1. cwSaveLoad::saveFlushCompleted is emitted on the save-thread (cwSaveLoad runs on a single-threaded QThreadPool, cwSaveLoadPrivate.cpp setMaxThreadCount(1)).
  2. cwProject.cpp:232-237 forwards it via a queued connection (the lambda's context is cwProject, which lives on the main thread):
    connect(saveLoad, &cwSaveLoad::saveFlushCompleted, this, [this, saveLoad]() {
        if (m_saveLoad != saveLoad) return;
        emit saveFlushCompleted();
    });
  3. The test calls project->waitSaveToFinish() which ends with AsyncFuture::waitForFinished(SaveFuture) (cwProject.cpp:1392-1402). That spins a nested QEventLoop and returns once SaveFuture finishes.
  4. Race: the underlying cwSaveLoad worker emits saveFlushCompleted from a path that is timing-independent of SaveFuture completion. If the spin loop exits before the queued cross-thread signal is delivered to the main thread, the QSignalSpy records zero hits.

This is the kind of AsyncFuture::waitForFinished event-loop fragility CLAUDE.md already warns about, surfacing here as a synchronization gap between the future and the cross-thread signal.

Reproduction

ASAN_OPTIONS=detect_container_overflow=0 \
  ./build/<preset>/cavewhere-test "saveFlushCompleted fires after adding a trip"

Or, to measure the flake rate:

PASS=0; FAIL=0
for i in $(seq 1 20); do
  R=$(ASAN_OPTIONS=detect_container_overflow=0 \
        ./build/<preset>/cavewhere-test "saveFlushCompleted fires after adding a trip" 2>&1 | tail -3)
  if echo "$R" | grep -q "All tests passed"; then PASS=$((PASS+1)); else FAIL=$((FAIL+1)); fi
done
echo "pass=$PASS fail=$FAIL"

Both before and after the in-progress Phase 1 lineplot-async port: 19 pass / 1 fail out of 20 (same on both runs).

The sibling test saveFlushCompleted fires after modifying a trip (line 2391) uses the same pattern and likely has the same latent race; it just hasn't been observed failing yet.

Suggested fixes (pick one)

  1. Test-side (cheapest): make the assertion poll-with-timeout instead of expecting the spy count to be settled the moment waitSaveToFinish() returns. Process events for up to ~1s while waiting for flushSpy.count() >= 1.

  2. Library-side (correct fix): give cwProject::waitSaveToFinish() a stronger ordering guarantee so callers don't need to know about the cross-thread re-emit. Easiest: drain the main-thread event queue (QCoreApplication::processEvents()) at the end of waitSaveToFinish() so any queued saveFlushCompleted re-emit is delivered before the function returns.

Option 2 protects every other caller from the same race (including the sibling test) and removes a footgun from the test-only API contract.

History

Test added in commit 69e56bd6 (Detect uncommitted changes on GitHistoryPage via saveFlushCompleted, 2026-04-01). Has been latently flaky since.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No fields configured for Bug.

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions