fix(hw): read live DUT IP for sd-autoboot boards + surface URI-resolve errors #109
Merged
sd-autoboot boards (and newer kernels) name the DUT NIC end0/enp*s0, so `ip -o addr show eth0` returned nothing and resolve_uri wrongly fell back to the stale static NetworkService.address (a shared placeholder), pointing the iiod check at the wrong IP. Read the first global-scope IPv4 instead, and log the previously-swallowed shell error on fallback so misses are diagnosable. Claude-Session: https://claude.ai/code/session_01BuMAqiic68LrMr6wWC7NCe
sd-autoboot boards DHCP a fresh random-MAC IP each boot, so the exporter's static NetworkService.address is stale; the request-side URI resolver can't re-acquire a shell on these targets (no ShellDriver found) and falls back to that wrong address, sending the iiod check to the wrong host. After boot, read the live global-scope IPv4 over the strategy's own already-active shell and write it onto NetworkService.address. Best-effort + guarded, so boards whose static address is already correct (bq/mini2) are unaffected. Claude-Session: https://claude.ai/code/session_01BuMAqiic68LrMr6wWC7NCe
…kService.address Claude-Session: https://claude.ai/code/session_01BuMAqiic68LrMr6wWC7NCe
What
Two related improvements to how a booted board's libIIO URI is resolved, found while diagnosing the daily infra job's nemo failure.
- `request/uri.py:resolve_uri` now reads the DUT's global-scope IPv4 (not a hardcoded `eth0`, which newer kernels rename to `end0`/`enp*s0`) and logs the previously swallowed shell error on fallback. The silent `except Exception: pass` had hidden every failure reason.
- `BootFPGASoCTFTP`: after boot, poll the live DUT IP over the strategy's own working shell and write it onto `NetworkService.address`, so the URI resolver targets the board's real DHCP address instead of a stale static one. sd-autoboot boards DHCP a fresh (random-MAC) IP each boot, and the request-side resolver can't reliably re-acquire a shell on them. Best-effort and fully guarded (never fails the boot; leaves the static address untouched if no IP appears), so boards whose static address is already correct (bq/mini2) are unaffected.
Honest scope
These do not fix the nemo/adrv9009 daily failure. Validated on the nemo board across three boots: it boots Linux via SD-autoboot but never obtains a DHCP IP within 90 s (the shell works; `ip -o addr show scope global` returns nothing). That's a genuine board/SD-image networking problem: no CI code can fix a board that isn't getting on the network. The daily job is correctly reporting it. This PR just makes such cases diagnosable (clear "no live IP" instead of a misleading "iiod not reachable at 10.0.0.23") and correctly handles sd-autoboot boards that do get a lease.
Tests
`test_request_uri.py` / `test_request_core.py` green; lint clean. The strategy helper is guarded/best-effort and was exercised live on the nemo board (polled, handled the no-IP case gracefully without failing the boot).
https://claude.ai/code/session_01BuMAqiic68LrMr6wWC7NCe
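For readers without the diff open, the behaviour described under What can be sketched roughly as follows. This is an illustration, not the code merged in this PR: `first_global_ipv4`, `resolve_live_address`, and the injected `run` callable are hypothetical names, and the 2 s poll interval is an assumption (the 90 s timeout mirrors the figure quoted under Honest scope). The idea is to parse the first `inet` entry from `ip -o addr show scope global` regardless of interface name, poll until an address appears, and log the reason before falling back to the static address.

```python
import logging
import re
import time
from typing import Callable, Optional

log = logging.getLogger(__name__)

# Matches the IPv4 column of one-line `ip -o addr` output, for example:
#   2: end0    inet 192.168.1.57/24 brd 192.168.1.255 scope global dynamic end0\ ...
# "inet6" entries do not match because whitespace must follow "inet".
_INET_RE = re.compile(r"\binet\s+(\d{1,3}(?:\.\d{1,3}){3})/\d+")


def first_global_ipv4(ip_output: str) -> Optional[str]:
    """Return the first IPv4 address found in `ip -o addr` output, or None."""
    for line in ip_output.splitlines():
        match = _INET_RE.search(line)
        if match:
            return match.group(1)
    return None


def resolve_live_address(
    run: Callable[[str], str],      # hypothetical: runs a command on the DUT shell
    static_address: str,            # the configured (possibly stale) address
    timeout: float = 90.0,
    interval: float = 2.0,          # assumed poll interval
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the DUT for a global-scope IPv4; fall back to the static address."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            ip = first_global_ipv4(run("ip -o addr show scope global"))
            if ip:
                return ip
        except Exception as exc:
            # Log instead of swallowing, so a fallback is diagnosable.
            log.warning("live IP lookup failed: %s", exc)
        if time.monotonic() >= deadline:
            break
        sleep(interval)
    log.warning("no live IP within %.0f s; keeping %s", timeout, static_address)
    return static_address
```

Passing the shell in as a plain callable keeps the sketch independent of any particular driver and makes the fallback path easy to exercise with a stub that returns an empty string or raises.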