test(e2e): rootless container harness for the local e2e suite #27

Merged: hikaps merged 5 commits into develop from e2e-container on Jun 29, 2026

Conversation

hikaps (Owner) commented Jun 25, 2026

Summary

Replaces the setup-helper-env.sh / teardown-helper-env.sh scripts (which mutated a long-lived dev-box) with a self-contained Fedora test image. The image bakes the entire environment as build steps; teardown is --rm (container removal). No setup/teardown scripts, no remnants, no pkill-the-host-bus risk.

What the image bakes (appiumtests/Dockerfile)

  • Fedora 43 KDE stack + gamescope + PipeWire + evemu + Qt6/KF6 dev.
  • couchplay (built) + the KDE selenium driver (built from git).
  • A compiled Qt6 gamescope stub (helpers/stub_gamescope.cpp) — the binary must be named gamescope so Qt6 Wayland reports resourceClass="gamescope" (a Python stub reports python3 and is never matched — this is the key correction from the cp-e2e experiment).
  • player2/player3 users + couchplay group + per-user PipeWire config.
  • Permissive test D-Bus system-bus policy.

Entrypoint (container/entrypoint.sh)

Starts the container's system bus + mock helper + PipeWire; then runs the tests under dbus-run-session (isolated session bus) so the nested kwin's org.kde.KWin is the one the app's WindowManager queries — fixing the compositor/session-bus mismatch that xfails the two session-lifecycle tests.

Changes

  • conftest.mock_helper: detects the entrypoint's mock (container path → no start/stop); falls back to starting it (dev-box path).
  • mock_helper.LaunchInstance: re-adds stub-spawn (COUCHPLAY_STUB_GAMESCOPE) — now the working compiled binary.
  • Deletes setup-helper-env.sh + teardown-helper-env.sh; adds .dockerignore.

Run

podman build -f appiumtests/Dockerfile -t couchplay-e2e .
podman run --rm couchplay-e2e appiumtests/                   # full suite
podman run --rm couchplay-e2e appiumtests/ -m "not requires_helper"  # smoke tier

Status

Draft until the image is built + the two session-lifecycle tests verified green (un-xfail). The approach is proven (cp-e2e experiment confirmed: dbus-run-session isolates org.kde.KWin; the compiled gamescope stub reports resourceClass=gamescope → WindowManager matches).

hikaps added 4 commits June 24, 2026 18:12
…ardown

Replace the setup/teardown-helper-env.sh scripts (which mutated a long-lived
box) with a self-contained Fedora test image. The image bakes the entire
environment as build steps; teardown is container removal (--rm).

Image (appiumtests/Dockerfile) bakes:
- Fedora 43 KDE stack + gamescope + PipeWire + evemu + Qt6/KF6 dev.
- couchplay (built) + the KDE selenium driver (built from git).
- a compiled Qt6 'gamescope' stub (helpers/stub_gamescope.cpp) -- the binary
  must be named gamescope so Qt6 Wayland reports resourceClass 'gamescope'
  (a Python/PySide6 stub reports 'python3' and is never matched).
- player2/player3 users + couchplay group + per-user PipeWire config.
- permissive test D-Bus system-bus policy.

Entrypoint (container/entrypoint.sh) starts: the container system bus, the mock
helper (owns io.github.hikaps.CouchPlayHelper), PipeWire; then runs the tests
under dbus-run-session so the nested kwin's org.kde.KWin is the one the app's
WindowManager queries (fixes the compositor/session-bus mismatch). The nested
kwin + selenium runner are in container/run-in-session.sh.

conftest.mock_helper now detects the entrypoint's mock (container) and yields
without start/stop; falls back to starting it (dev-box). setup/teardown scripts
deleted; .dockerignore added.
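The container/dev-box split in conftest.mock_helper can be sketched roughly as below. This is an illustration only, not the PR's code: it assumes detection is a plain environment check on COUCHPLAY_MOCK_EXTERNAL (the variable a later commit in this PR names), and `start_mock` / `stop_mock` are hypothetical callables standing in for whatever launches and stops the mock on a dev-box.

```python
import os
from contextlib import contextmanager


@contextmanager
def mock_helper(start_mock, stop_mock, environ=None):
    """Yield a mock-helper handle, owning its lifecycle only when needed.

    start_mock/stop_mock are hypothetical callables; the real fixture may
    detect the external mock differently.
    """
    env = os.environ if environ is None else environ
    if env.get("COUCHPLAY_MOCK_EXTERNAL"):
        # Container path: the entrypoint already owns the bus name,
        # so there is nothing to start or stop here.
        yield None
        return
    # Dev-box path: start the mock ourselves and always clean it up.
    handle = start_mock()
    try:
        yield handle
    finally:
        stop_mock(handle)
```

Under pytest the same body would sit behind `@pytest.fixture` instead of `@contextmanager`; the context-manager form is used here only so the sketch runs without pytest installed.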

The 2 xfailed session tests are expected to pass in this container (the
compositor isolation + gamescope window are now provided); un-xfail after
verifying the built image.

- entrypoint: mkdir /run/dbus + /run/user/0, export XDG_RUNTIME_DIR (the
  container has no runtime dirs by default).
- run-in-session.sh: do NOT start kwin ourselves -- selenium-webdriver-at-spi-run
  starts its own nested kwin (kwin_reexec); a second one conflicted. The script
  now just runs the runner under the entrypoint's dbus-run-session, so the
  runner's kwin registers org.kde.KWin on the isolated bus.

Image builds clean (podman build). Verified: entrypoint brings up system bus +
mock helper + PipeWire. The nested kwin needs the host graphics session, so run
via distrobox (proven in the cp-e2e experiment) or podman with display passthrough;
plain rootless podman lacks the graphics socket.

The original entrypoint assumed root (dbus-daemon --system, useradd). With no
passwordless sudo on the host, rootful podman/distrobox is impossible. Reworked
to run entirely as the non-root container user:

- entrypoint: a user-owned --session dbus-daemon exported as
  DBUS_SYSTEM_BUS_ADDRESS (the app's QDBusConnection::systemBus() AND the mock's
  dbus.SystemBus() both honor it) -- NO real system bus, NO root, NO polkit/bus
  policy. Scrub host session/display env leaked in by distrobox; use a private
  XDG_RUNTIME_DIR so the nested kwin doesn't collide with the host compositor.
- mock_helper: COUCHPLAY_MOCK_FAKE_USERS mode returns plausible uids without
  useradd/userdel (no root, no user leak); cleanup is a no-op there.
- conftest: COUCHPLAY_MOCK_EXTERNAL short-circuits the mock fixture (the
  entrypoint owns the name before pytest starts).
- run-in-session: force software rendering (rootless can't access /dev/dri;
  AT-SPI needs no pixels) + APPIUM_ARTIFACT_OUTPUT_PATH to a writable dir.
- Dockerfile: pre-install the runner's python deps into SYSTEM python so its
  runtime 'pip3 install' no-ops (rootless can't write /usr/local).
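The environment handling described in the entrypoint bullet can be sketched as a pure function. This is a hedged illustration in Python rather than the entrypoint's shell: the list of scrubbed variables is an assumption (the commit does not enumerate them), and `rootless_env` is a hypothetical name.

```python
# Variables assumed, for illustration, to leak in from the host session.
# The commit does not list the exact set it scrubs.
HOST_SESSION_VARS = (
    "DBUS_SESSION_BUS_ADDRESS",
    "WAYLAND_DISPLAY",
    "DISPLAY",
    "XDG_SESSION_ID",
)


def rootless_env(base, runtime_dir, bus_address):
    """Return a copy of `base` prepared for the rootless harness."""
    # Drop host session/display variables so nothing talks to the host.
    env = {k: v for k, v in base.items() if k not in HOST_SESSION_VARS}
    # Private runtime dir so the nested kwin does not collide with the host.
    env["XDG_RUNTIME_DIR"] = runtime_dir
    # Point "system bus" clients at the user-owned daemon instead.
    env["DBUS_SYSTEM_BUS_ADDRESS"] = bus_address
    return env
```

The resulting mapping could be passed as `env=` to `subprocess.run` when launching the mock and the test runner, so both resolve the same bus address.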

Verified in a rootless distrobox (no sudo): full 43-test e2e suite runs;
test_two_instances_launch passes (2 LaunchInstance calls reach the mock);
test_start_and_stop_session now PASSES in-container (the dbus-run-session +
nested-kwin compositor fix works) -> was xfail.

PR #27 should be infra-only -- a reproducible, isolated, rootless e2e runtime --
not speculative test-capability. Cut everything that doesn't serve that:

- helpers/stub_gamescope.cpp: deleted. It only served the still-xfailed window-
  positioning test; the passing helper test (test_two_instances_launch) reads the
  mock's launch LOG, not a stub window. (The mock's LaunchInstance already returns
  a fake pid + records; it never spawned the stub.)
- container/dbus-policy.conf: deleted. Dead -- the user-owned --session bus
  (DBUS_SYSTEM_BUS_ADDRESS) is already permissive; this policy was never consulted.
- Dockerfile: dropped the stub g++ build, the dbus-policy COPY, and the
  COUCHPLAY_STUB_GAMESCOPE ENV.

Verified on the rebuilt image: helper tier 5 passed / 1 skipped / 2 xfailed
(no regression vs pre-cut). Smoke tier is unaffected by these removals.

hikaps marked this pull request as ready for review June 26, 2026 21:19
hikaps changed the title from "test(e2e): containerize the harness (bake everything, drop setup/teardown)" to "test(e2e): rootless container harness for the local e2e suite" Jun 26, 2026

P2 (ReviewerContainer): entrypoint.sh mock-readiness loop silently fell through
on timeout, letting the helper suite run against a missing mock. Now tracks a
flag and exits non-zero (dumping /tmp/mock-helper.log) if the name is never
acquired -- the only gate, since backgrounded procs aren't caught by set -e.
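The shape of that readiness gate, poll with a deadline and fail loudly on timeout, can be sketched as follows. This is a Python rendering of the idea, not the entrypoint's shell code; `wait_until`, `gate_on_mock`, and the `name_is_owned` callable are hypothetical names.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.1):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Returns True on success and False on timeout, so the caller cannot
    fall through silently.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def gate_on_mock(name_is_owned, log_path, timeout=10.0):
    """Return a process exit code: 0 if the mock is ready, 1 otherwise."""
    if wait_until(name_is_owned, timeout=timeout):
        return 0
    # Surface the mock's log so the failure is diagnosable.
    try:
        with open(log_path, encoding="utf-8", errors="replace") as fh:
            print(fh.read())
    except OSError as exc:
        print(f"could not read {log_path}: {exc}")
    return 1
```

A caller would end with something like `sys.exit(gate_on_mock(check, "/tmp/mock-helper.log"))` before starting the helper suite, where `check` is whatever asks the bus whether the name has an owner.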

P3 (ReviewerContainer): Dockerfile header was misleading (advertised the cut
gamescope stub + bare 'podman run' that runs as root with no USER directive);
rewritten to describe the verified distrobox flow. Dropped unused 'sudo' from the
rootless image. Stopped discarding cmake/selenium build output so failures are
diagnosable. Pinned the selenium driver to a commit (overridable via SWA_REF
build-arg) for reproducibility.

P3 (ReviewerPython): conftest mock_helper docstring no longer references the cut
'stub gamescope windows'.

One ReviewerPython finding (PEP8 E302 before class MockHelper) was a false
positive -- the diff hunk didn't render the two blank lines that are present in
the file.

Verified: image rebuilds clean; helper tier 5 passed / 1 skipped / 2 xfailed on
the fixed image (no regression; the new mock-gate passes on the happy path).

hikaps merged commit 685449f into develop Jun 29, 2026
1 check passed
hikaps deleted the e2e-container branch June 29, 2026 18:03

hikaps added a commit that referenced this pull request Jun 29, 2026
…tests

Qt6 ComboBox popup items are not exposed with accessible names (ListModel text
isn't promoted), so selection-by-NAME cannot work. Switch select_combo_option
to popup type-ahead (focus the combo, type the option text, Enter) -- verified
green in the #27 rootless container harness.
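A type-ahead selection of this kind might look like the sketch below. It is an assumption-laden illustration, not the repository's select_combo_option: it assumes the combo is focused with click() and that the element exposes click() and send_keys() the way Selenium and Appium WebElement objects do. The Enter key is written as its W3C WebDriver code point so the sketch runs without Selenium installed.

```python
ENTER = "\ue007"  # W3C WebDriver code point for the Enter key


def select_combo_option(combo, option_text):
    """Select an option by type-ahead instead of by accessible name.

    `combo` is any element exposing click() and send_keys().
    Focusing via click() is an assumption of this sketch.
    """
    combo.click()                 # focus the combo box
    combo.send_keys(option_text)  # type-ahead to the option text
    combo.send_keys(ENTER)        # confirm the highlighted option
```

This sidesteps the missing accessible names entirely, at the cost of depending on option texts that type-ahead can match unambiguously.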

Lower streaming controls (frame rate/codec/bitrate) fall below the fold in the
headless container's small viewport; those tests now skip cleanly there and run
in a full-size session. The streaming->helper e2e test is xfail until the
harness pre-assigns a user to the streaming instance (logic is C++ unit-covered
by test_streaming_session).

hikaps added a commit that referenced this pull request Jun 29, 2026
… green

test_streaming_session_calls_helper was xfail because the streaming instance
had no username, so SessionRunner::setupStreamingInstance aborted on its
empty-username guard before any helper call. Assign player2 via comboUser
before Start Session; the test now passes in the #27 container harness,
verifying CreateVirtualOutput + CreateNullSink + the sunshine LaunchInstance
(gameCommand referencing sunshine.conf) reach the helper end-to-end through
the real app UI.

hikaps added a commit that referenced this pull request Jun 30, 2026
…ce test

Install Sunshine into appiumtests/Dockerfile from its upstream GitHub release
RPM (Sunshine-<ver>-1.fc43.x86_64) -- repo.lizardbyte.dev is unreachable from
some sandboxes, but GitHub releases are not. Pinned for reproducibility.

Add a tiny sunshine_config_generator binary (links SunshineConfig) that emits a
REAL SunshineConfig config set, and test_sunshine_integration.py which feeds it
to the real sunshine binary and asserts Sunshine accepts every key
(logs config: 'key' = value) and stays up. This is the test the unit suite
cannot be: it catches drift between SunshineConfig and what Sunshine actually
parses -- the silent-failure class that would make pairing/streaming fail.

Verified in the #27 rootless container: Sunshine runs headless (software
render + nested-kwin Wayland capture), connects to wayland-0, binds 47989, and
accepts the SunshineConfig format. Full streaming suite: 6 passed, 3 skipped
(lower controls off-screen in the headless viewport), 0 failed.

hikaps added a commit that referenced this pull request Jun 30, 2026
P1 (streaming was broken in any real/non-mock deployment):
- StreamManager: pass the STREAMING user's uid (getpwnam) as compositorUid,
  not getuid(). The virtual output lives in /run/user/<streamingUserUid>, so
  getuid() pointed Sunshine at the GUI host's compositor (capturing the wrong
  screen, which is also a privacy risk).
- StreamManager: track the attemptRestart sync-failure timer in m_restartTimers
  so stopStream can cancel it (was an anonymous singleShot that could resurrect
  and corrupt a freshly-restarted instance).
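The uid fix amounts to resolving the streaming user by name rather than using the calling process's uid. The real code is C++; the sketch below shows the same lookup in Python for illustration, with `compositor_runtime_dir` as a hypothetical name and the `lookup` parameter added only so the function can be exercised without real accounts.

```python
import pwd


def compositor_runtime_dir(username, lookup=pwd.getpwnam):
    """Return (uid, runtime_dir) for the user that owns the virtual output."""
    uid = lookup(username).pw_uid  # pwd.getpwnam raises KeyError if unknown
    return uid, f"/run/user/{uid}"
```

Using os.getuid() here instead would always return the GUI process's own uid, which is the bug described above.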

P2 code defects:
- helper: re-resolve the null-sink module index by name before unload (PipeWire
  reuses indexes across session restarts; a saved index could unload the wrong
  module). Shared unloadNullSinkModule used by DestroyNullSink + destructor.
- helper: CreateVirtualOutput polls for a genuinely NEW Wayland socket and
  errors if none appears (was falling back to a pre-existing socket -> Sunshine
  bound the wrong compositor).
- StreamManager: port-bump on startup crash now skips ports used by sibling
  streams (bumping by exactly PORT_SPACING landed on the next instance's slot).
- SunshineConfig: check write() return in writeCredentialsFile (silent
  corruption on the auth path; inconsistent with sibling writers).
- SessionRunner: uninhibit the screen saver on the streaming-setup failure path
  (inhibit ran before the loop; the failure return leaked the cookie).
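The port-bump fix can be illustrated with a small function that keeps stepping until it misses every sibling's port. This is a sketch under assumptions: the value of PORT_SPACING below is made up, and `next_free_port` is a hypothetical name, not StreamManager's API.

```python
PORT_SPACING = 100  # illustrative value; the real constant is not shown here


def next_free_port(current, used_ports, spacing=PORT_SPACING, limit=65535):
    """Bump `current` by `spacing` until it misses every sibling's port."""
    candidate = current + spacing
    while candidate in used_ports:
        # A single bump would land on a sibling's slot, so keep stepping.
        candidate += spacing
    if candidate > limit:
        raise ValueError("no free port slot left")
    return candidate
```

With a sibling already on the next slot, the function skips it rather than colliding, which is the case the original single bump got wrong.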

Test gaps (the mock suites missed the P1s):
- test_streammanager: cover the auto-restart state machine (startup-crash port
  bump, no-bump after streaming, autoRestart=false immediate removal, and
  stopStream cancelling the tracked restart timer).
- test_sunshine_config: injection test for sanitizeConfValue; honest framing on
  testCustomCredentials (self-consistency guard, not Sunshine-correctness).
- test_sunshine_integration: assert unique config VALUES, not just key names
  (Sunshine's defaults contain the same keys).
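Checking values rather than key names could be done by parsing the `config: 'key' = value` log lines mentioned earlier. The sketch below is not test_sunshine_integration.py itself: the regular expression is an assumption about the exact log layout, and the function names and sample keys are hypothetical.

```python
import re

# Assumed layout of an accepted-key log line: config: 'key' = value
CONFIG_LINE = re.compile(r"config: '(?P<key>[^']+)' = (?P<value>.*)")


def accepted_config(log_text):
    """Map each key Sunshine reported to the value it logged."""
    accepted = {}
    for line in log_text.splitlines():
        match = CONFIG_LINE.search(line)
        if match:
            accepted[match.group("key")] = match.group("value").strip()
    return accepted


def missing_or_changed(expected, log_text):
    """Return the expected keys whose logged value is absent or different."""
    accepted = accepted_config(log_text)
    return sorted(k for k, v in expected.items() if accepted.get(k) != v)
```

Feeding the generator distinctive values (ones that differ from Sunshine's defaults) is what makes an empty result meaningful; matching key names alone would also pass against an untouched default config.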

Verified in the #27 container: couchplay + couchplay-helper + tests compile;
test_streammanager 25/25, test_sunshine_config 19/19, sunshine integration pass.