test(e2e): rootless container harness for the local e2e suite #27
Merged
…ardown

Replace the setup/teardown-helper-env.sh scripts (which mutated a long-lived box) with a self-contained Fedora test image. The image bakes the entire environment as build steps; teardown is container removal (--rm).

Image (appiumtests/Dockerfile) bakes:
- Fedora 43 KDE stack + gamescope + PipeWire + evemu + Qt6/KF6 dev.
- couchplay (built) + the KDE selenium driver (built from git).
- a compiled Qt6 'gamescope' stub (helpers/stub_gamescope.cpp) -- the binary must be named gamescope so Qt6 Wayland reports resourceClass 'gamescope' (a Python/PySide6 stub reports 'python3' and is never matched).
- player2/player3 users + couchplay group + per-user PipeWire config.
- permissive test D-Bus system-bus policy.

Entrypoint (container/entrypoint.sh) starts: the container system bus, the mock helper (owns io.github.hikaps.CouchPlayHelper), PipeWire; then runs the tests under dbus-run-session so the nested kwin's org.kde.KWin is the one the app's WindowManager queries (fixes the compositor/session-bus mismatch). The nested kwin + selenium runner are in container/run-in-session.sh.

conftest.mock_helper now detects the entrypoint's mock (container) and yields without start/stop; falls back to starting it (dev-box).

setup/teardown scripts deleted; .dockerignore added.

The 2 xfailed session tests are expected to pass in this container (the compositor isolation + gamescope window are now provided); un-xfail after verifying the built image.
- entrypoint: mkdir /run/dbus + /run/user/0, export XDG_RUNTIME_DIR (the container has no runtime dirs by default).
- run-in-session.sh: do NOT start kwin ourselves -- selenium-webdriver-at-spi-run starts its own nested kwin (kwin_reexec); a second one conflicted. The script now just runs the runner under the entrypoint's dbus-run-session, so the runner's kwin registers org.kde.KWin on the isolated bus.

Image builds clean (podman build). Verified: entrypoint brings up system bus + mock helper + PipeWire. The nested kwin needs the host graphics session, so run via distrobox (proven in the cp-e2e experiment) or podman with display passthrough; plain rootless podman lacks the graphics socket.
The original entrypoint assumed root (dbus-daemon --system, useradd). With no passwordless sudo on the host, rootful podman/distrobox is impossible. Reworked to run entirely as the non-root container user:
- entrypoint: a user-owned --session dbus-daemon exported as DBUS_SYSTEM_BUS_ADDRESS (the app's QDBusConnection::systemBus() AND the mock's dbus.SystemBus() both honor it) -- NO real system bus, NO root, NO polkit/bus policy. Scrub host session/display env leaked in by distrobox; use a private XDG_RUNTIME_DIR so the nested kwin doesn't collide with the host compositor.
- mock_helper: COUCHPLAY_MOCK_FAKE_USERS mode returns plausible uids without useradd/userdel (no root, no user leak); cleanup is a no-op there.
- conftest: COUCHPLAY_MOCK_EXTERNAL short-circuits the mock fixture (the entrypoint owns the name before pytest starts).
- run-in-session: force software rendering (rootless can't access /dev/dri; AT-SPI needs no pixels) + APPIUM_ARTIFACT_OUTPUT_PATH to a writable dir.
- Dockerfile: pre-install the runner's python deps into SYSTEM python so its runtime 'pip3 install' no-ops (rootless can't write /usr/local).

Verified in a rootless distrobox (no sudo): full 43-test e2e suite runs; test_two_instances_launch passes (2 LaunchInstance calls reach the mock); test_start_and_stop_session now PASSES in-container (the dbus-run-session + nested-kwin compositor fix works) -> was xfail.
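The COUCHPLAY_MOCK_EXTERNAL short-circuit in the conftest item above could be sketched roughly as follows. This is an illustration only, not the PR's actual fixture: the helper name mock_helper_lifecycle and the start/stop callables are hypothetical, and treating any non-empty value of the variable as "external" is an assumption.

```python
import os
from contextlib import contextmanager
from typing import Callable, Iterator, Mapping, Optional


@contextmanager
def mock_helper_lifecycle(
    start: Callable[[], object],
    stop: Callable[[object], None],
    env: Mapping[str, str] = os.environ,
) -> Iterator[Optional[object]]:
    """Yield a mock handle, starting one only when no external mock is declared.

    Hypothetical helper: `start` and `stop` stand in for whatever launches and
    terminates the mock helper process on a dev box.
    """
    if env.get("COUCHPLAY_MOCK_EXTERNAL"):
        # Container path (assumed semantics): the entrypoint already owns the
        # bus name, so the fixture neither starts nor stops anything.
        yield None
        return

    # Dev-box path: own the full lifecycle and always clean up.
    handle = start()
    try:
        yield handle
    finally:
        stop(handle)
```

A pytest fixture could wrap this context manager and yield its handle, which keeps the environment check in one place and makes both paths testable without a running bus.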
PR #27 should be infra-only -- a reproducible, isolated, rootless e2e runtime -- not speculative test-capability. Cut everything that doesn't serve that:
- helpers/stub_gamescope.cpp: deleted. It only served the still-xfailed window-positioning test; the passing helper test (test_two_instances_launch) reads the mock's launch LOG, not a stub window. (The mock's LaunchInstance already returns a fake pid + records; it never spawned the stub.)
- container/dbus-policy.conf: deleted. Dead -- the user-owned --session bus (DBUS_SYSTEM_BUS_ADDRESS) is already permissive; this policy was never consulted.
- Dockerfile: dropped the stub g++ build, the dbus-policy COPY, and the COUCHPLAY_STUB_GAMESCOPE ENV.

Verified on the rebuilt image: helper tier 5 passed / 1 skipped / 2 xfailed (no regression vs pre-cut). Smoke tier is unaffected by these removals.
P2 (ReviewerContainer): entrypoint.sh mock-readiness loop silently fell through on timeout, letting the helper suite run against a missing mock. Now tracks a flag and exits non-zero (dumping /tmp/mock-helper.log) if the name is never acquired -- the only gate, since backgrounded procs aren't caught by set -e.

P3 (ReviewerContainer): Dockerfile header was misleading (advertised the cut gamescope stub + bare 'podman run' that runs as root with no USER directive); rewritten to describe the verified distrobox flow. Dropped unused 'sudo' from the rootless image. Stopped discarding cmake/selenium build output so failures are diagnosable. Pinned the selenium driver to a commit (overridable via SWA_REF build-arg) for reproducibility.

P3 (ReviewerPython): conftest mock_helper docstring no longer references the cut 'stub gamescope windows'. One ReviewerPython finding (PEP8 E302 before class MockHelper) was a false positive -- the diff hunk didn't render the two blank lines that are present in the file.

Verified: image rebuilds clean; helper tier 5 passed / 1 skipped / 2 xfailed on the fixed image (no regression; the new mock-gate passes on the happy path).
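The readiness gate from the P2 item (poll, remember whether the name was acquired, fail loudly on timeout) can be illustrated in Python. The real gate lives in entrypoint.sh as shell; the function names, default timeout, and polling interval below are assumptions made for the sketch.

```python
import sys
import time
from typing import Callable


def wait_until(ready: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.1) -> bool:
    """Poll `ready` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            # Report the timeout explicitly instead of falling through.
            return False
        time.sleep(interval)


def gate_on_mock(ready: Callable[[], bool], log_path: str,
                 timeout: float = 10.0, interval: float = 0.1) -> int:
    """Return a process exit code: 0 if the mock became ready, 1 otherwise.

    On failure the mock's log is written to stderr so the cause is visible.
    """
    if wait_until(ready, timeout, interval):
        return 0
    try:
        with open(log_path, encoding="utf-8", errors="replace") as fh:
            sys.stderr.write(fh.read())
    except OSError:
        sys.stderr.write(f"(no log found at {log_path})\n")
    return 1
```

A caller would pass a predicate that checks whether the bus name is owned and hand the result to sys.exit, so a missing mock stops the run before any test starts.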
hikaps
added a commit
that referenced
this pull request
Jun 29, 2026
…tests Qt6 ComboBox popup items are not exposed with accessible names (ListModel text isn't promoted), so selection-by-NAME cannot work. Switch select_combo_option to popup type-ahead (focus the combo, type the option text, Enter) -- verified green in the #27 rootless container harness. Lower streaming controls (frame rate/codec/bitrate) fall below the fold in the headless container's small viewport; those tests now skip cleanly there and run in a full-size session. The streaming->helper e2e test is xfail until the harness pre-assigns a user to the streaming instance (logic is C++ unit-covered by test_streaming_session).
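A minimal sketch of the type-ahead selection described above, written against a duck-typed element so it runs without a driver. The ComboLike protocol mirrors the click() and send_keys() methods of a Selenium WebElement; using click() to focus the combo and the exact body of select_combo_option are assumptions, not the PR's code.

```python
from typing import Protocol

# WebDriver code point for the Enter key (the value of Selenium's Keys.ENTER).
ENTER_KEY = "\ue007"


class ComboLike(Protocol):
    """The two element methods the sketch relies on."""

    def click(self) -> None: ...

    def send_keys(self, *value: str) -> None: ...


def select_combo_option(combo: ComboLike, option_text: str) -> None:
    """Select an option by type-ahead instead of by accessible name."""
    combo.click()                 # focus the combo (assumed to open or focus the popup)
    combo.send_keys(option_text)  # type-ahead moves the highlight to the match
    combo.send_keys(ENTER_KEY)    # commit the highlighted option
```

Because the function never looks up popup items, it does not depend on the items exposing accessible names.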
hikaps
added a commit
that referenced
this pull request
Jun 29, 2026
… green test_streaming_session_calls_helper was xfail because the streaming instance had no username, so SessionRunner::setupStreamingInstance aborted on its empty-username guard before any helper call. Assign player2 via comboUser before Start Session; the test now passes in the #27 container harness, verifying CreateVirtualOutput + CreateNullSink + the sunshine LaunchInstance (gameCommand referencing sunshine.conf) reach the helper end-to-end through the real app UI.
hikaps
added a commit
that referenced
this pull request
Jun 30, 2026
…ce test

Install Sunshine into appiumtests/Dockerfile from its upstream GitHub release RPM (Sunshine-<ver>-1.fc43.x86_64) -- repo.lizardbyte.dev is unreachable from some sandboxes, but GitHub releases are not. Pinned for reproducibility.

Add a tiny sunshine_config_generator binary (links SunshineConfig) that emits a REAL SunshineConfig config set, and test_sunshine_integration.py which feeds it to the real sunshine binary and asserts Sunshine accepts every key (logs config: 'key' = value) and stays up. This is the test the unit suite cannot be: it catches drift between SunshineConfig and what Sunshine actually parses -- the silent-failure class that would make pairing/streaming fail.

Verified in the #27 rootless container: Sunshine runs headless (software render + nested-kwin Wayland capture), connects to wayland-0, binds 47989, and accepts the SunshineConfig format. Full streaming suite: 6 passed, 3 skipped (lower controls off-screen in the headless viewport), 0 failed.
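The acceptance check can be pictured as parsing Sunshine's log for the config: 'key' = value lines mentioned above and comparing them with what the generator emitted. The sketch below assumes that line shape exactly as quoted in the commit message; the function names and the sample keys and values in the usage are illustrative, not taken from test_sunshine_integration.py.

```python
import re
from typing import Dict, Mapping, Optional, Tuple

# Assumed log shape, taken from the commit message: config: 'key' = value
CONFIG_LINE = re.compile(r"config: '(?P<key>[^']+)' = (?P<value>.*)$")


def parse_accepted_config(log_text: str) -> Dict[str, str]:
    """Collect every key/value pair the log reports as accepted."""
    accepted: Dict[str, str] = {}
    for line in log_text.splitlines():
        match = CONFIG_LINE.search(line)
        if match:
            accepted[match.group("key")] = match.group("value").strip()
    return accepted


def config_drift(
    expected: Mapping[str, str], log_text: str
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {key: (expected, logged)} for keys that are missing or differ."""
    accepted = parse_accepted_config(log_text)
    return {
        key: (value, accepted.get(key))
        for key, value in expected.items()
        if accepted.get(key) != value
    }
```

An empty result means every generated key was logged with the generated value. Comparing values as well as key names is the stricter of the two checks, since a key alone could also be logged with a default.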
hikaps
added a commit
that referenced
this pull request
Jun 30, 2026
P1 (streaming was broken in any real/non-mock deployment):
- StreamManager: pass the STREAMING user's uid (getpwnam) as compositorUid, not getuid() -- the virtual output lives in /run/user/<streamingUserUid>, so getuid() pointed Sunshine at the GUI host compositor (wrong/privacy capture).
- StreamManager: track the attemptRestart sync-failure timer in m_restartTimers so stopStream can cancel it (was an anonymous singleShot that could resurrect and corrupt a freshly-restarted instance).

P2 code defects:
- helper: re-resolve the null-sink module index by name before unload (PipeWire reuses indexes across session restarts; a saved index could unload the wrong module). Shared unloadNullSinkModule used by DestroyNullSink + destructor.
- helper: CreateVirtualOutput polls for a genuinely NEW Wayland socket and errors if none appears (was falling back to a pre-existing socket -> Sunshine bound the wrong compositor).
- StreamManager: port-bump on startup crash now skips ports used by sibling streams (bumping by exactly PORT_SPACING landed on the next instance's slot).
- SunshineConfig: check write() return in writeCredentialsFile (silent corruption on the auth path; inconsistent with sibling writers).
- SessionRunner: uninhibit the screen saver on the streaming-setup failure path (inhibit ran before the loop; the failure return leaked the cookie).

Test gaps (the mock suites missed the P1s):
- test_streammanager: cover the auto-restart state machine (startup-crash port bump, no-bump after streaming, autoRestart=false immediate removal, and stopStream cancelling the tracked restart timer).
- test_sunshine_config: injection test for sanitizeConfValue; honest framing on testCustomCredentials (self-consistency guard, not Sunshine-correctness).
- test_sunshine_integration: assert unique config VALUES, not just key names (Sunshine's defaults contain the same keys).
Verified in the #27 container: couchplay + couchplay-helper + tests compile; test_streammanager 25/25, test_sunshine_config 19/19, sunshine integration pass.
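The port-bump fix in the P2 list is easy to picture with a small model. The real logic is C++ in StreamManager; the Python below is only a sketch, and the spacing value of 100, the function name, and the upper bound check are assumptions chosen for illustration.

```python
from typing import Iterable

# Hypothetical spacing; the real PORT_SPACING constant lives in StreamManager.
PORT_SPACING = 100


def next_free_base_port(current: int, sibling_ports: Iterable[int],
                        spacing: int = PORT_SPACING,
                        max_port: int = 65535) -> int:
    """Bump `current` by whole `spacing` steps until no sibling owns the slot."""
    taken = set(sibling_ports)
    candidate = current + spacing
    while candidate in taken:
        # A single bump would land on a sibling's slot, so keep stepping.
        candidate += spacing
    if candidate > max_port:
        raise ValueError("no free port slot below max_port")
    return candidate
```

With a sibling already on the next slot, one bump of exactly one spacing step would collide, which is the defect the commit describes; skipping taken slots avoids it.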
Summary

Replaces the setup-helper-env.sh/teardown-helper-env.sh scripts (which mutated a long-lived dev-box) with a self-contained Fedora test image. The image bakes the entire environment as build steps; teardown is --rm (container removal). No setup/teardown scripts, no remnants, no pkill-the-host-bus risk.

What the image bakes (appiumtests/Dockerfile)

- gamescope stub (helpers/stub_gamescope.cpp) -- the binary must be named gamescope so Qt6 Wayland reports resourceClass="gamescope" (a Python stub reports python3 and is never matched; this is the key correction from the cp-e2e experiment).
- player2/player3 users + couchplay group + per-user PipeWire config.

Entrypoint (container/entrypoint.sh)

Starts the container's system bus + mock helper + PipeWire; then runs the tests under dbus-run-session (isolated session bus) so the nested kwin's org.kde.KWin is the one the app's WindowManager queries, fixing the compositor/session-bus mismatch that xfails the two session-lifecycle tests.

Changes

- conftest.mock_helper: detects the entrypoint's mock (container path → no start/stop); falls back to starting it (dev-box path).
- mock_helper.LaunchInstance: re-adds stub-spawn (COUCHPLAY_STUB_GAMESCOPE) -- now the working compiled binary.
- Deletes setup-helper-env.sh + teardown-helper-env.sh; adds .dockerignore.

Run

Status

Draft until the image is built + the two session-lifecycle tests verified green (un-xfail). The approach is proven (cp-e2e experiment confirmed: dbus-run-session isolates org.kde.KWin; the compiled gamescope stub reports resourceClass=gamescope → WindowManager matches).