test(e2e): rootless container harness for the local e2e suite by hikaps · Pull Request #27 · hikaps/couchplay

hikaps · 2026-06-25T01:13:00Z

Summary

Replaces the setup-helper-env.sh / teardown-helper-env.sh scripts (which mutated a long-lived dev-box) with a self-contained Fedora test image. The image bakes the entire environment as build steps; teardown is --rm (container removal). No setup/teardown scripts, no remnants, no pkill-the-host-bus risk.

What the image bakes (`appiumtests/Dockerfile`)

Fedora 43 KDE stack + gamescope + PipeWire + evemu + Qt6/KF6 dev.
couchplay (built) + the KDE selenium driver (built from git).
A compiled Qt6 gamescope stub (helpers/stub_gamescope.cpp) — the binary must be named gamescope so Qt6 Wayland reports resourceClass="gamescope" (a Python stub reports python3 and is never matched — this is the key correction from the cp-e2e experiment).
player2/player3 users + couchplay group + per-user PipeWire config.
Permissive test D-Bus system-bus policy.

Entrypoint (`container/entrypoint.sh`)

Starts the container's system bus + mock helper + PipeWire; then runs the tests under dbus-run-session (isolated session bus) so the nested kwin's org.kde.KWin is the one the app's WindowManager queries — fixing the compositor/session-bus mismatch that xfails the two session-lifecycle tests.

Changes

conftest.mock_helper: detects the entrypoint's mock (container path → no start/stop); falls back to starting it (dev-box path).
mock_helper.LaunchInstance: re-adds stub-spawn (COUCHPLAY_STUB_GAMESCOPE) — now the working compiled binary.
Deletes setup-helper-env.sh + teardown-helper-env.sh; adds .dockerignore.
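
The detect-or-start behaviour of `conftest.mock_helper` can be sketched roughly as below. This is a minimal illustration, not the fixture from the PR: the three callables (`name_has_owner`, `start_mock`, `stop_mock`) are hypothetical stand-ins for whatever the real conftest uses to query the bus and manage the mock process, and the real fixture would carry a `@pytest.fixture` decorator.

```python
from typing import Callable, Iterator

# Bus name owned by the mock helper, as given in the PR's commit message.
HELPER_NAME = "io.github.hikaps.CouchPlayHelper"


def mock_helper(
    name_has_owner: Callable[[str], bool],  # hypothetical: "is the name taken?"
    start_mock: Callable[[], None],         # hypothetical: spawn the mock helper
    stop_mock: Callable[[], None],          # hypothetical: terminate the mock helper
) -> Iterator[str]:
    """Yield once a mock helper is available; stop it only if we started it."""
    if name_has_owner(HELPER_NAME):
        # Container path: the entrypoint already runs the mock, so the
        # fixture must neither start nor stop it.
        yield "external"
        return
    # Dev-box path: nothing owns the name, so the fixture manages the mock.
    start_mock()
    try:
        yield "managed"
    finally:
        stop_mock()
```

The design point is ownership: whichever side started the mock is the only side allowed to stop it, so a container run never kills a process the entrypoint is still relying on.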

Run

podman build -f appiumtests/Dockerfile -t couchplay-e2e .
podman run --rm couchplay-e2e appiumtests/                   # full suite
podman run --rm couchplay-e2e appiumtests/ -m "not requires_helper"  # smoke tier

Status

Draft until the image is built + the two session-lifecycle tests verified green (un-xfail). The approach is proven (cp-e2e experiment confirmed: dbus-run-session isolates org.kde.KWin; the compiled gamescope stub reports resourceClass=gamescope → WindowManager matches).

…ardown

Replace the setup/teardown-helper-env.sh scripts (which mutated a long-lived box) with a self-contained Fedora test image. The image bakes the entire environment as build steps; teardown is container removal (--rm).

Image (appiumtests/Dockerfile) bakes:
- Fedora 43 KDE stack + gamescope + PipeWire + evemu + Qt6/KF6 dev.
- couchplay (built) + the KDE selenium driver (built from git).
- a compiled Qt6 'gamescope' stub (helpers/stub_gamescope.cpp) -- the binary must be named gamescope so Qt6 Wayland reports resourceClass 'gamescope' (a Python/PySide6 stub reports 'python3' and is never matched).
- player2/player3 users + couchplay group + per-user PipeWire config.
- permissive test D-Bus system-bus policy.

Entrypoint (container/entrypoint.sh) starts: the container system bus, the mock helper (owns io.github.hikaps.CouchPlayHelper), PipeWire; then runs the tests under dbus-run-session so the nested kwin's org.kde.KWin is the one the app's WindowManager queries (fixes the compositor/session-bus mismatch). The nested kwin + selenium runner are in container/run-in-session.sh.

conftest.mock_helper now detects the entrypoint's mock (container) and yields without start/stop; falls back to starting it (dev-box). setup/teardown scripts deleted; .dockerignore added.

The 2 xfailed session tests are expected to pass in this container (the compositor isolation + gamescope window are now provided); un-xfail after verifying the built image.

- entrypoint: mkdir /run/dbus + /run/user/0, export XDG_RUNTIME_DIR (the container has no runtime dirs by default).
- run-in-session.sh: do NOT start kwin ourselves -- selenium-webdriver-at-spi-run starts its own nested kwin (kwin_reexec); a second one conflicted. The script now just runs the runner under the entrypoint's dbus-run-session, so the runner's kwin registers org.kde.KWin on the isolated bus.

Image builds clean (podman build). Verified: entrypoint brings up system bus + mock helper + PipeWire. The nested kwin needs the host graphics session, so run via distrobox (proven in the cp-e2e experiment) or podman with display passthrough; plain rootless podman lacks the graphics socket.

The original entrypoint assumed root (dbus-daemon --system, useradd). With no passwordless sudo on the host, rootful podman/distrobox is impossible. Reworked to run entirely as the non-root container user:
- entrypoint: a user-owned --session dbus-daemon exported as DBUS_SYSTEM_BUS_ADDRESS (the app's QDBusConnection::systemBus() AND the mock's dbus.SystemBus() both honor it) -- NO real system bus, NO root, NO polkit/bus policy. Scrub host session/display env leaked in by distrobox; use a private XDG_RUNTIME_DIR so the nested kwin doesn't collide with the host compositor.
- mock_helper: COUCHPLAY_MOCK_FAKE_USERS mode returns plausible uids without useradd/userdel (no root, no user leak); cleanup is a no-op there.
- conftest: COUCHPLAY_MOCK_EXTERNAL short-circuits the mock fixture (the entrypoint owns the name before pytest starts).
- run-in-session: force software rendering (rootless can't access /dev/dri; AT-SPI needs no pixels) + APPIUM_ARTIFACT_OUTPUT_PATH to a writable dir.
- Dockerfile: pre-install the runner's python deps into SYSTEM python so its runtime 'pip3 install' no-ops (rootless can't write /usr/local).

Verified in a rootless distrobox (no sudo): full 43-test e2e suite runs; test_two_instances_launch passes (2 LaunchInstance calls reach the mock); test_start_and_stop_session now PASSES in-container (the dbus-run-session + nested-kwin compositor fix works) -> was xfail.
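
As a rough illustration of the environment the rootless entrypoint hands to the test session, the sketch below builds a scrubbed copy of the host environment. It is written in Python for readability only; the real entrypoint is a shell script. The list of scrubbed variables and the `"1"` flag values are assumptions, since the commit message names neither.

```python
from typing import Dict, Mapping

# ASSUMPTION: the commit only says "host session/display env" is scrubbed;
# this particular list of variables is illustrative.
SCRUBBED = ("WAYLAND_DISPLAY", "DISPLAY", "DBUS_SESSION_BUS_ADDRESS")


def container_env(
    host_env: Mapping[str, str],
    runtime_dir: str,
    bus_address: str,
) -> Dict[str, str]:
    """Return a copy of host_env prepared for the rootless test session."""
    # Drop anything that would point the nested kwin at the host compositor.
    env = {k: v for k, v in host_env.items() if k not in SCRUBBED}
    # Private runtime dir so the nested compositor's sockets do not collide.
    env["XDG_RUNTIME_DIR"] = runtime_dir
    # The user-owned bus stands in for the system bus for app and mock alike.
    env["DBUS_SYSTEM_BUS_ADDRESS"] = bus_address
    # ASSUMPTION: "1" as the enabling value; the commit names only the variables.
    env["COUCHPLAY_MOCK_EXTERNAL"] = "1"
    env["COUCHPLAY_MOCK_FAKE_USERS"] = "1"
    return env
```

For example, `container_env(os.environ, "/tmp/cp-run", "unix:path=/tmp/cp-run/system_bus")` would yield an environment with no host display variables and both mock switches set (the paths here are placeholders).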

PR #27 should be infra-only -- a reproducible, isolated, rootless e2e runtime -- not speculative test-capability. Cut everything that doesn't serve that:
- helpers/stub_gamescope.cpp: deleted. It only served the still-xfailed window-positioning test; the passing helper test (test_two_instances_launch) reads the mock's launch LOG, not a stub window. (The mock's LaunchInstance already returns a fake pid + records; it never spawned the stub.)
- container/dbus-policy.conf: deleted. Dead -- the user-owned --session bus (DBUS_SYSTEM_BUS_ADDRESS) is already permissive; this policy was never consulted.
- Dockerfile: dropped the stub g++ build, the dbus-policy COPY, and the COUCHPLAY_STUB_GAMESCOPE ENV.

Verified on the rebuilt image: helper tier 5 passed / 1 skipped / 2 xfailed (no regression vs pre-cut). Smoke tier is unaffected by these removals.

P2 (ReviewerContainer): entrypoint.sh mock-readiness loop silently fell through on timeout, letting the helper suite run against a missing mock. Now tracks a flag and exits non-zero (dumping /tmp/mock-helper.log) if the name is never acquired -- the only gate, since backgrounded procs aren't caught by set -e.

P3 (ReviewerContainer): Dockerfile header was misleading (advertised the cut gamescope stub + bare 'podman run' that runs as root with no USER directive); rewritten to describe the verified distrobox flow. Dropped unused 'sudo' from the rootless image. Stopped discarding cmake/selenium build output so failures are diagnosable. Pinned the selenium driver to a commit (overridable via SWA_REF build-arg) for reproducibility.

P3 (ReviewerPython): conftest mock_helper docstring no longer references the cut 'stub gamescope windows'. One ReviewerPython finding (PEP8 E302 before class MockHelper) was a false positive -- the diff hunk didn't render the two blank lines that are present in the file.

Verified: image rebuilds clean; helper tier 5 passed / 1 skipped / 2 xfailed on the fixed image (no regression; the new mock-gate passes on the happy path).
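
The readiness gate described in the P2 fix follows a common pattern: poll until ready, remember whether readiness was ever reached, and fail loudly otherwise. The sketch below shows that pattern in Python rather than the entrypoint's shell; `wait_until` and `gate` are hypothetical names, and the `ready` callable stands in for "the helper's bus name has an owner".

```python
import sys
import time
from typing import Callable


def wait_until(ready: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.1) -> bool:
    """Poll ready() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            # Report the timeout instead of silently falling through.
            return False
        time.sleep(interval)


def gate(ready: Callable[[], bool], log_path: str, timeout: float = 10.0) -> int:
    """Return a process exit code: 0 if ready, 1 (after dumping the log) if not."""
    if wait_until(ready, timeout=timeout):
        return 0
    try:
        with open(log_path, encoding="utf-8", errors="replace") as log:
            sys.stderr.write(log.read())
    except OSError:
        sys.stderr.write(f"no log found at {log_path}\n")
    return 1
```

Returning an explicit status matters here because a backgrounded process that dies does not trip `set -e`; the gate is the only place the failure can surface.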

…tests

Qt6 ComboBox popup items are not exposed with accessible names (ListModel text isn't promoted), so selection-by-NAME cannot work. Switch select_combo_option to popup type-ahead (focus the combo, type the option text, Enter) -- verified green in the #27 rootless container harness.

Lower streaming controls (frame rate/codec/bitrate) fall below the fold in the headless container's small viewport; those tests now skip cleanly there and run in a full-size session.

The streaming->helper e2e test is xfail until the harness pre-assigns a user to the streaming instance (logic is C++ unit-covered by test_streaming_session).
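
A minimal sketch of the type-ahead selection, assuming the combo is a WebDriver-style element with `click()` and `send_keys()`. The real `select_combo_option` in the test suite may take different arguments, and using a click to give the combo focus is an assumption.

```python
from typing import Protocol

# WebDriver's code point for the Enter key (the value selenium exposes as Keys.ENTER).
ENTER = "\ue007"


class Element(Protocol):
    """The two WebDriver element methods this sketch relies on."""

    def click(self) -> None: ...

    def send_keys(self, *value: str) -> None: ...


def select_combo_option(combo: Element, option_text: str) -> None:
    """Pick an option by typing its text instead of locating the popup item."""
    combo.click()                 # ASSUMPTION: clicking is how focus is given
    combo.send_keys(option_text)  # type-ahead moves the highlight to the match
    combo.send_keys(ENTER)        # commit the highlighted option
```

This sidesteps the accessibility gap entirely: the popup items never need to be found by name, only the combo itself.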

… green

test_streaming_session_calls_helper was xfail because the streaming instance had no username, so SessionRunner::setupStreamingInstance aborted on its empty-username guard before any helper call. Assign player2 via comboUser before Start Session; the test now passes in the #27 container harness, verifying CreateVirtualOutput + CreateNullSink + the sunshine LaunchInstance (gameCommand referencing sunshine.conf) reach the helper end-to-end through the real app UI.

…ce test

Install Sunshine into appiumtests/Dockerfile from its upstream GitHub release RPM (Sunshine-<ver>-1.fc43.x86_64) -- repo.lizardbyte.dev is unreachable from some sandboxes, but GitHub releases are not. Pinned for reproducibility.

Add a tiny sunshine_config_generator binary (links SunshineConfig) that emits a REAL SunshineConfig config set, and test_sunshine_integration.py which feeds it to the real sunshine binary and asserts Sunshine accepts every key (logs config: 'key' = value) and stays up. This is the test the unit suite cannot be: it catches drift between SunshineConfig and what Sunshine actually parses -- the silent-failure class that would make pairing/streaming fail.

Verified in the #27 rootless container: Sunshine runs headless (software render + nested-kwin Wayland capture), connects to wayland-0, binds 47989, and accepts the SunshineConfig format. Full streaming suite: 6 passed, 3 skipped (lower controls off-screen in the headless viewport), 0 failed.
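
The acceptance check could look roughly like the sketch below. It assumes only the log shape quoted above (`config: 'key' = value`); everything else, including the function names and the sample keys in the usage note, is hypothetical rather than taken from test_sunshine_integration.py.

```python
import re
from typing import Dict, List, Mapping

# Matches the fragment quoted in the commit message: config: 'key' = value
CONFIG_LINE = re.compile(r"config: '(?P<key>[^']+)' = (?P<value>.*)$")


def accepted_config(log_text: str) -> Dict[str, str]:
    """Collect every key/value pair the log reports as parsed."""
    accepted: Dict[str, str] = {}
    for line in log_text.splitlines():
        match = CONFIG_LINE.search(line)
        if match:
            accepted[match.group("key")] = match.group("value").strip()
    return accepted


def missing_or_changed(expected: Mapping[str, str], log_text: str) -> List[str]:
    """Return the expected keys that were not logged with the expected value."""
    seen = accepted_config(log_text)
    return sorted(key for key, value in expected.items() if seen.get(key) != value)
```

A test would then assert `missing_or_changed(generated_config, sunshine_log) == []`. Comparing values as well as keys guards against a default value for the same key being mistaken for acceptance of the generated one.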

P1 (streaming was broken in any real/non-mock deployment):
- StreamManager: pass the STREAMING user's uid (getpwnam) as compositorUid, not getuid() -- the virtual output lives in /run/user/<streamingUserUid>, so getuid() pointed Sunshine at the GUI host compositor (wrong/privacy capture).
- StreamManager: track the attemptRestart sync-failure timer in m_restartTimers so stopStream can cancel it (was an anonymous singleShot that could resurrect and corrupt a freshly-restarted instance).

P2 code defects:
- helper: re-resolve the null-sink module index by name before unload (PipeWire reuses indexes across session restarts; a saved index could unload the wrong module). Shared unloadNullSinkModule used by DestroyNullSink + destructor.
- helper: CreateVirtualOutput polls for a genuinely NEW Wayland socket and errors if none appears (was falling back to a pre-existing socket -> Sunshine bound the wrong compositor).
- StreamManager: port-bump on startup crash now skips ports used by sibling streams (bumping by exactly PORT_SPACING landed on the next instance's slot).
- SunshineConfig: check write() return in writeCredentialsFile (silent corruption on the auth path; inconsistent with sibling writers).
- SessionRunner: uninhibit the screen saver on the streaming-setup failure path (inhibit ran before the loop; the failure return leaked the cookie).

Test gaps (the mock suites missed the P1s):
- test_streammanager: cover the auto-restart state machine (startup-crash port bump, no-bump after streaming, autoRestart=false immediate removal, and stopStream cancelling the tracked restart timer).
- test_sunshine_config: injection test for sanitizeConfValue; honest framing on testCustomCredentials (self-consistency guard, not Sunshine-correctness).
- test_sunshine_integration: assert unique config VALUES, not just key names (Sunshine's defaults contain the same keys).

Verified in the #27 container: couchplay + couchplay-helper + tests compile; test_streammanager 25/25, test_sunshine_config 19/19, sunshine integration pass.
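
The port-bump fix can be illustrated with a small sketch. StreamManager itself is C++; this Python version only shows the selection logic, and both the function name and the `PORT_SPACING` value of 100 are assumptions, since the commit does not state the real constant.

```python
from typing import Iterable

PORT_SPACING = 100  # ASSUMPTION: illustrative value, not the project's constant


def next_free_base_port(
    current: int,
    sibling_ports: Iterable[int],
    spacing: int = PORT_SPACING,
    limit: int = 65535,
) -> int:
    """Bump by `spacing` until the slot is not taken by a sibling stream."""
    used = set(sibling_ports)
    candidate = current + spacing
    while candidate in used:
        # Bumping by exactly one slot would collide with the next instance.
        candidate += spacing
    if candidate > limit:
        raise ValueError("no free port slot below the limit")
    return candidate
```

With a crashed stream on 47989 and a sibling already on 48089, a plain `current + spacing` bump lands on the sibling's slot, whereas the loop moves on to 48189.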

hikaps added 4 commits June 24, 2026 18:12

hikaps marked this pull request as ready for review June 26, 2026 21:19

hikaps changed the title ~~test(e2e): containerize the harness (bake everything, drop setup/teardown)~~ test(e2e): rootless container harness for the local e2e suite Jun 26, 2026

hikaps merged commit 685449f into develop Jun 29, 2026
1 check passed

hikaps deleted the e2e-container branch June 29, 2026 18:03

hikaps mentioned this pull request Jun 29, 2026

feat(streaming): add Sunshine-based streaming for split-screen sessions #22

Merged

hikaps merged 5 commits into develop from e2e-container
