fix(bootstrap): resolve DNS failures and add container memory limits #516
brianwtaylor wants to merge 4 commits into NVIDIA:main
This seems not necessary now, right?
Good call. No need for these tests with the logic living properly in Rust. Apologies for the duplicate PR; I lost the first one while deploying these changes in my test environment yesterday.
Explored this change more last night and ran into a memory issue that I'm working through. I'd like to get it figured out rather than rush this change in and introduce a new round of signal noise in the issues section. Temporarily moving to draft. -brian
Docker's embedded DNS at 127.0.0.11 is only reachable from the container's own network namespace. k3s pods in child namespaces cannot reach it, causing silent DNS failures on Ubuntu and other systemd-resolved hosts where /etc/resolv.conf contains 127.0.0.53.

Sniff upstream DNS resolvers from the host in the Rust bootstrap crate by reading /run/systemd/resolve/resolv.conf (systemd-resolved only; this intentionally does NOT read /etc/resolv.conf, to avoid bypassing Docker Desktop's DNAT proxy on macOS/Windows). Filter loopback addresses (127.x.x.x and ::1) and pass the result to the container as the UPSTREAM_DNS env var. Skip DNS sniffing for remote deploys, where the local host's resolvers would be wrong.

The entrypoint checks UPSTREAM_DNS first, falling back to /etc/resolv.conf inside the container for manual launches. This follows the existing pattern used by registry config, SSH gateway, GPU support, and image tags.

Closes NVIDIA#437

Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
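The parse-and-filter step described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function is named after the `parse_resolv_conf()` helper the PR introduces, but its signature, return type, and handling of malformed lines are assumptions.

```rust
use std::net::IpAddr;

/// Hypothetical signature: takes the text of a resolv.conf file and returns
/// the non-loopback nameservers in file order.
fn parse_resolv_conf(contents: &str) -> Vec<IpAddr> {
    contents
        .lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            // Only `nameserver <addr>` lines matter; comments and
            // `search`/`options` lines are dropped here.
            if fields.next()? != "nameserver" {
                return None;
            }
            // Assumption: unparseable addresses are skipped, not treated as errors.
            fields.next()?.parse::<IpAddr>().ok()
        })
        // Drops 127.0.0.0/8 and ::1, e.g. the 127.0.0.53 stub listener.
        .filter(|addr| !addr.is_loopback())
        .collect()
}
```

Keeping the parser a pure function over a string means it can be tested without touching the host's /run/systemd/resolve/resolv.conf; an empty result would simply mean no UPSTREAM_DNS value is passed.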
Drop deploy/docker/tests/test-dns-resolvers.sh: the resolver logic now lives in the Rust bootstrap crate with cargo test coverage, making the standalone shell harness redundant.

Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
Move the resolv.conf parsing logic out of resolve_upstream_dns() into its own parse_resolv_conf() function. The 10 deterministic tests now exercise the production code path instead of a reimplemented helper.

Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
Add --memory flag to `openshell gateway start` that caps the gateway container's memory via Docker HostConfig. When unset, auto-detects 80% of available memory by querying the Docker daemon (docker info), which correctly reports the Docker Desktop VM's allocated memory on macOS and Windows rather than the full host RAM. Docker OOM-kills the container instead of letting runaway sandbox growth trigger the host kernel OOM killer.

- parse_memory_limit(): human-readable sizes (80g, 4096m, bytes)
- detect_memory_limit(): async, queries Docker daemon MemTotal
- memory_swap = memory (disables swap inside container)
- OPENSHELL_MEMORY_LIMIT env var supported

Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
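To make the sizing rules above concrete, here is a rough sketch of how the limit could be computed. It is not the PR's implementation: `parse_memory_limit` borrows the name from the commit, but its signature and error type are assumptions, `resolve_memory_limit` is a hypothetical helper, and the suffixes are assumed to be binary multiples (as with Docker's own `--memory` flag). The Docker daemon query is replaced by a plain `mem_total_bytes` argument so the example stays self-contained.

```rust
/// Parse sizes such as "80g", "4096m", or a plain byte count.
/// Assumption: suffixes are binary multiples (1g = 1024^3 bytes).
fn parse_memory_limit(input: &str) -> Result<i64, String> {
    let s = input.trim().to_ascii_lowercase();
    let (digits, multiplier): (&str, i64) = match s.chars().last() {
        Some('m') => (&s[..s.len() - 1], 1 << 20),
        Some('g') => (&s[..s.len() - 1], 1 << 30),
        _ => (s.as_str(), 1),
    };
    let value: i64 = digits
        .parse()
        .map_err(|_| format!("invalid memory limit: {input:?}"))?;
    if value <= 0 {
        return Err(format!("memory limit must be positive: {input:?}"));
    }
    value
        .checked_mul(multiplier)
        .ok_or_else(|| format!("memory limit overflows i64: {input:?}"))
}

/// Hypothetical helper: returns the (memory, memory_swap) pair for the
/// container. An explicit value wins; otherwise use 80% of the total the
/// daemon reports. Docker treats memory_swap as memory plus swap, so equal
/// values leave the container with no swap.
fn resolve_memory_limit(
    explicit: Option<&str>,
    mem_total_bytes: i64,
) -> Result<(i64, i64), String> {
    let limit = match explicit {
        Some(raw) => parse_memory_limit(raw)?,
        // Divide before multiplying so large totals cannot overflow.
        None => mem_total_bytes / 5 * 4,
    };
    Ok((limit, limit))
}
```

With these assumptions, `resolve_memory_limit(Some("4096m"), total)` ignores `total` entirely, while passing `None` on a daemon that reports 10 GiB gives an 8 GiB cap with swap disabled.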
536bd97 to 2175d02
We are going to refactor the bootstrapping workflow to use a microVM instead of a Docker-based k3s cluster. This should improve some of the reliability on the networking side. I would hold off on this PR until we have moved over to the microVM-based approach. The discussion on this topic is in #558.
Supersedes #478
@drew, reworked per your review: the bootstrap crate now sniffs resolvers and passes them as an `UPSTREAM_DNS` env var. No system files are mounted into the container.

Summary
- Read upstream resolvers from `/run/systemd/resolve/resolv.conf` (systemd-resolved hosts only)
- Pass them to the container as the `UPSTREAM_DNS` env var
- Entrypoint checks `UPSTREAM_DNS` first, falls back to `/etc/resolv.conf` for manual launches
- Add `--memory` flag to `openshell gateway start` that caps container memory via Docker HostConfig
- Support the `OPENSHELL_MEMORY_LIMIT` env var

Closes #437
Changes
- `crates/openshell-bootstrap/src/docker.rs`: Add `resolve_upstream_dns()` that reads `/run/systemd/resolve/resolv.conf`, filters loopback addresses, and returns real upstream resolvers. Pass them as the `UPSTREAM_DNS` env var to the cluster container (skipped for remote deploys). Includes unit tests.
- `deploy/docker/cluster-entrypoint.sh`: Add `get_upstream_resolvers()` that reads the `UPSTREAM_DNS` env var (priority) or falls back to `/etc/resolv.conf`. When upstream resolvers are found, write them directly to the k3s resolv.conf instead of relying on the DNAT proxy. Improve DNS verification logging on failure.
- `crates/openshell-bootstrap/src/docker.rs`: Add `parse_memory_limit()` for human-readable size parsing and `detect_memory_limit()` that queries Docker daemon MemTotal. Sets memory_swap = memory to disable swap inside the container.
- `crates/openshell-cli/src/main.rs` / `bootstrap.rs`: Add `--memory` CLI flag, wire through to Docker container creation.

Root Cause
Docker's embedded DNS at `127.0.0.11` is only reachable from the container's own network namespace. The existing DNAT rules forward to this loopback address, but k3s pods run in child network namespaces where the forwarded packets are dropped as martian packets. On systemd-resolved hosts, `/etc/resolv.conf` contains `127.0.0.53` (another loopback), so the fallback also fails silently.

DNS Flow: Before vs After
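The resolver precedence in the "after" flow can be sketched as below. The real logic lives in the shell entrypoint (`get_upstream_resolvers()`); this Rust version only mirrors the described behavior for illustration. `select_resolvers` is a hypothetical name, the comma-or-whitespace separator for `UPSTREAM_DNS` is an assumption, and so is filtering loopback addresses in the fallback path.

```rust
use std::net::IpAddr;

/// Pick the resolvers to write into the k3s resolv.conf.
/// `upstream_dns` is the value of the UPSTREAM_DNS env var, if set;
/// `container_resolv_conf` is the text of /etc/resolv.conf in the container.
/// An empty result means no usable upstream was found.
fn select_resolvers(upstream_dns: Option<&str>, container_resolv_conf: &str) -> Vec<IpAddr> {
    // 1. UPSTREAM_DNS takes priority when it yields a usable address.
    //    Assumption: entries are separated by commas or whitespace.
    let from_env: Vec<IpAddr> = upstream_dns
        .unwrap_or("")
        .split(|c: char| c == ',' || c.is_whitespace())
        .filter_map(|s| s.parse::<IpAddr>().ok())
        .filter(|addr| !addr.is_loopback())
        .collect();
    if !from_env.is_empty() {
        return from_env;
    }
    // 2. Fallback for manual launches: nameserver lines from the container's
    //    own /etc/resolv.conf, minus loopback entries such as 127.0.0.11.
    container_resolv_conf
        .lines()
        .filter_map(|line| line.trim().strip_prefix("nameserver"))
        .filter_map(|rest| rest.trim().parse::<IpAddr>().ok())
        .filter(|addr| !addr.is_loopback())
        .collect()
}
```

Under these assumptions, a container that only sees Docker's embedded DNS at 127.0.0.11 and has no `UPSTREAM_DNS` gets an empty list, which corresponds to staying on the existing DNAT proxy path.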
Testing
Software Versions
Results
cgroupns=host)
===VALIDATION TOPOLOGY===
===WHAT EACH NODE PROVED DURING VALIDATION===
Node A ─── Baseline capture. systemd-resolved active, upstream 192.168.4.1,
stub at 127.0.0.53. All existing containers healthy. Zero drift.
Node B ─── Fix works. Custom binary + image from PR branch deployed.
UPSTREAM_DNS=192.168.4.1 set, written to k3s resolv.conf.
Pod DNS resolution verified. All pods healthy.
Node C ─── No change on macOS. No UPSTREAM_DNS set. DNAT proxy path intact.
Pod DNS resolution verified.
Node D ─── No change on WSL2. No UPSTREAM_DNS set. DNAT proxy path intact.
Pod DNS resolution verified.
Automated Tests
Checklist