Skip to content

fix: security hardening batch 1 (SEC-002 through SEC-010)#548

Merged
johntmyers merged 10 commits intomainfrom
fix/security-hardening-batch-1
Mar 23, 2026
Merged

fix: security hardening batch 1 (SEC-002 through SEC-010)#548
johntmyers merged 10 commits intomainfrom
fix/security-hardening-batch-1

Conversation

@johntmyers
Copy link
Collaborator

Summary

Defensive hardening fixes for 9 security findings from an external PSIRT review (SEC-002 through SEC-010). Each finding is addressed in a separate commit. Larger architectural changes (structured exec protocol, SSH host key pinning) are deferred to follow-up issues.

Changes

High Priority

  • SEC-009 (HTTP Request Smuggling): Reject requests with both Content-Length and Transfer-Encoding headers per RFC 7230 §3.3.3. Replace String::from_utf8_lossy with strict UTF-8 validation. Reject bare LF line endings and invalid HTTP versions.

Medium Priority

  • SEC-002 (Shell Injection Hardening): Reject null bytes and newlines in shell_escape(). Add input validation in exec_sandbox (control character rejection, size limits). Enforce 256 KiB total command string limit.
  • SEC-003 (SSH Transport Boundary): Add defense-in-depth null byte and length checks in run_exec_with_russh before channel.exec(). Enhanced exec logging with command length and truncated preview.
  • SEC-007 (Port Truncation): Validate port_to_connect <= 65535 before u32-to-u16 cast. Replace string-literal loopback check with is_loopback_host() covering full 127.0.0.0/8, IPv4-mapped IPv6, and bracketed forms.
  • SEC-008 (Error Message Leakage): Replace verbatim internal error strings in router_error_to_http with generic messages. Full error context remains logged server-side at warn level.
  • SEC-006 (SSH Proxy SSRF): Add IP validation in start_single_use_ssh_proxy — resolve DNS before connecting, block loopback and link-local (cloud metadata) addresses, connect to validated SocketAddr to prevent TOCTOU.

Lower Priority

  • SEC-010 (Chunked Body DoS): Replace unchecked +2 additions with checked_add. Add MAX_CHUNKED_BODY (10 MiB) and MAX_CHUNK_COUNT (4096) limits.
  • SEC-004 (CLI Double-Shell Escape): Apply shell_escape to inner_cmd in SSH path for double-shell-interpretation safety. Add validate_gateway_name and validate_ssh_host.
  • SEC-005 (CIDR Breadth Warning): Log warnings for allowed_ips CIDRs broader than /16. Block control-plane ports (K8s API, etcd, kubelet) unconditionally in resolve_and_check_allowed_ips.

Testing

  • mise run test passes (0 failures across all crates)
  • Unit tests added/updated for every fix
  • E2E tests added/updated (not applicable — changes are all in validation/parsing logic)

Checklist

  • Follows Conventional Commits
  • Each finding in its own commit for reviewability
  • Architecture docs updated (follow-up issues needed for SEC-002/SEC-003 architectural changes)

Harden the L7 REST proxy HTTP parser against request smuggling:

- Reject requests containing both Content-Length and Transfer-Encoding
  headers per RFC 7230 Section 3.3.3 (CL/TE ambiguity)
- Replace String::from_utf8_lossy with strict UTF-8 validation to
  prevent interpretation gaps with upstream servers
- Reject bare LF line endings (require CRLF per HTTP spec)
- Validate HTTP version string (HTTP/1.0 or HTTP/1.1 only)
Harden the gRPC exec handler against command injection via the
structured-to-shell-string conversion:

- Reject null bytes and newlines/carriage returns in shell_escape()
- Add input validation in exec_sandbox: reject control characters in
  command args, env values, and workdir
- Enforce size limits: max 1024 args, 32 KiB per arg/value, 4 KiB
  workdir, 256 KiB total assembled command string
- Change shell_escape and build_remote_exec_command to return Result
  so callers must handle validation failures
Add defense-in-depth validation in run_exec_with_russh before sending
the command to the sandbox SSH server:

- Reject null bytes in command string at transport boundary
- Enforce max command length (256 KiB) at transport boundary
- Enhance stream_exec_over_ssh logging with command length, stdin
  length, and truncated command preview for audit trail
…EC-007)

Harden the SSH direct-tcpip channel handler:

- Validate port_to_connect <= 65535 before u32-to-u16 cast to prevent
  port truncation (e.g., 65537 becoming port 1)
- Replace string-literal loopback check with is_loopback_host() that
  covers the full 127.0.0.0/8 range, IPv4-mapped IPv6 (::ffff:127.x),
  bracketed IPv6, and case-insensitive localhost
- Remove #[allow(clippy::cast_possible_truncation)] since the cast is
  now proven safe by the preceding range check
…SEC-008)

Replace verbatim internal error strings in router_error_to_http with
generic messages to prevent information leakage to sandboxed code.
Upstream URLs, internal hostnames, TLS details, and file paths are no
longer exposed. Full error context is still logged server-side at warn
level by the caller for debugging.
Add IP validation in start_single_use_ssh_proxy to prevent SSRF if a
sandbox status record were poisoned:

- Resolve DNS before connecting and validate the resolved IP
- Block loopback (127.0.0.0/8) and link-local (169.254.0.0/16, covers
  cloud metadata endpoint) addresses
- Block IPv4-mapped IPv6 variants of the same ranges
- Connect to the validated SocketAddr directly to prevent TOCTOU
- Add debug logging of resolved target IP for audit
Harden parse_chunked_body in the inference interception path:

- Replace all unchecked +2 additions with checked_add for consistent
  overflow safety across all target architectures
- Add MAX_CHUNKED_BODY (10 MiB) to cap decoded body size
- Add MAX_CHUNK_COUNT (4096) to prevent CPU exhaustion via tiny chunks
- Early-reject chunk sizes larger than remaining buffer space
…(SEC-004)

Harden doctor_exec against command injection in the SSH remote path:

- Apply shell_escape to inner_cmd in the SSH path so it survives the
  double shell interpretation (SSH remote shell + sh -lc). This also
  fixes a correctness bug where multi-word commands were silently
  broken in the SSH path.
- Add validate_gateway_name to reject shell metacharacters in gateway
  names before use in container_name
- Add validate_ssh_host to reject metacharacters in remote_host loaded
  from metadata.json
…st (SEC-005)

Defense-in-depth for the allowed_ips feature:

- Log a warning when a CIDR entry has a prefix length < /16, as overly
  broad ranges may unintentionally expose control-plane services
- Block K8s API (6443), etcd (2379/2380), and kubelet (10250/10255)
  ports unconditionally in resolve_and_check_allowed_ips, even when
  the resolved IP matches an allowed_ips entry
@johntmyers johntmyers requested a review from a team as a code owner March 23, 2026 16:46
@johntmyers johntmyers added the topic:security Security issues label Mar 23, 2026
@johntmyers johntmyers self-assigned this Mar 23, 2026
@johntmyers johntmyers added topic:security Security issues test:e2e Requires end-to-end coverage labels Mar 23, 2026
…C-008)

The SEC-008 fix changed the error message from 'no compatible route
for source protocol ...' to 'no compatible inference route available'.
Update the E2E assertion substring to match.
@johntmyers johntmyers merged commit 834f8aa into main Mar 23, 2026
9 checks passed
@johntmyers johntmyers deleted the fix/security-hardening-batch-1 branch March 23, 2026 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test:e2e Requires end-to-end coverage topic:security Security issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants