single-port-server.js crashes on ECONNRESET — unhandled 'error' event takes down dashboard #2103

@Bdandc

Summary

When AO_PATH_BASED_MUX=1 is set, the bundled single-port-server.js proxy crashes with an unhandled error event the first time a downstream client abruptly resets a TCP connection. The dashboard goes 502 with no auto-restart.

Concrete repro is a Cloudflare Tunnel in front of ao start — CF's connection pool churn produces ECONNRESETs as normal behaviour, so the proxy dies within minutes of going public.

Stack trace from a real failure

[single-port] [single-port] listening on 3000; HTTP → 127.0.0.1:4000; /ao-terminal-mux → 127.0.0.1:14801/mux
[single-port] node:events:487
[single-port]       throw er; // Unhandled 'error' event
[single-port]       ^
[single-port] Error: read ECONNRESET
[single-port]     at TCP.onStreamRead (node:internal/stream_base_commons:216:20)
[single-port] Emitted 'error' event on Socket instance at:
[single-port]     at emitErrorNT (node:internal/streams/destroy:170:8)
[single-port]     at emitErrorCloseNT (node:internal/streams/destroy:129:3)
[single-port]     at process.processTicksAndRejections (node:internal/process/task_queues:90:21) {
[single-port]   errno: -54,
[single-port]   code: 'ECONNRESET',
[single-port]   syscall: 'read'
[single-port] }
[single-port] Node.js v25.9.0
[single-port] exited with code 1
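
The "Unhandled 'error' event" line in the trace is standard EventEmitter behaviour rather than anything specific to AO: emitting 'error' on an emitter with no 'error' listener throws the error itself. The sketch below reproduces that rule with plain EventEmitters. The helper names `makeReset` and `emitReset` are made up for illustration and do not come from single-port-server.js. On a real socket the emit happens on a later tick (see `emitErrorNT` in the trace), so there is no surrounding try/catch and the throw becomes an uncaught exception.

```javascript
const { EventEmitter } = require('node:events');

// Builds an error shaped like the one in the trace (illustrative only).
function makeReset() {
  return Object.assign(new Error('read ECONNRESET'), {
    code: 'ECONNRESET',
    syscall: 'read',
  });
}

// Emits 'error' on the emitter and reports whether the emit threw.
function emitReset(emitter) {
  try {
    emitter.emit('error', makeReset());
    return 'handled';
  } catch (err) {
    return `thrown: ${err.code}`;
  }
}

const unguarded = new EventEmitter();
const guarded = new EventEmitter();
guarded.on('error', () => {}); // any 'error' listener is enough to stop the throw

console.log(emitReset(unguarded)); // thrown: ECONNRESET
console.log(emitReset(guarded));   // handled
```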

Why this matters

The header comment in single-port-server.js explicitly describes this file as the supported way to put AO behind a reverse proxy that can only forward one hostname:port pair upstream — the exact Cloudflare Tunnel use case. But the file in 0.9.4 isn't robust enough for that role: any client-side reset crashes the whole proxy, and lifecycle-manager doesn't restart it. Anyone trying to follow the "single proxy rule pointing at PORT is sufficient" guidance in the header comment will hit this on day one.

Suggested fix shape

Add 'error' listeners on both the client and upstream Socket instances inside createSinglePortServer, plus a top-level process.on('uncaughtException', (err) => { /* log and continue */ }) handler. This is standard node-http-proxy-style hardening: the listener can log and close the dead socket without taking down serverListen.
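
A minimal sketch of that listener shape, assuming the proxy pairs a client socket with an upstream socket over plain `net` connections. I have not checked how createSinglePortServer is actually structured, so `guardSocketPair` and `createHardenedProxy` are hypothetical names, and the real file may route HTTP and WebSocket upgrades differently.

```javascript
const net = require('node:net');

// Hypothetical helper: attach 'error' listeners to both halves of a proxied
// connection so a reset on one side is logged and the peer is torn down,
// instead of surfacing as an unhandled 'error' event.
function guardSocketPair(client, upstream, log = console.error) {
  const onError = (label, peer) => (err) => {
    log(`[single-port] ${label} socket error: ${err.code || err.message}`);
    peer.destroy(); // close the other half; the listening server keeps running
  };
  client.on('error', onError('client', upstream));
  upstream.on('error', onError('upstream', client));
}

// Hypothetical wiring, standing in for the body of createSinglePortServer.
function createHardenedProxy(upstreamPort, upstreamHost = '127.0.0.1') {
  return net.createServer((client) => {
    const upstream = net.connect(upstreamPort, upstreamHost);
    guardSocketPair(client, upstream); // attach before piping any data
    client.pipe(upstream);
    upstream.pipe(client);
  });
}
```

With the per-socket listeners in place, the process-level uncaughtException handler would only be a last resort. The Node.js documentation cautions that resuming normal operation after an uncaught exception is not generally safe, so it is probably best limited to logging.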

Workaround for operators

Until this lands, the deployment-time workaround is to leave AO_PATH_BASED_MUX unset (default mode) and have your reverse proxy do the path-based routing itself. For Caddy:

:8080 {
    basicauth { bdandc <hash> }
    handle /ao-terminal-mux* {
        rewrite * /mux
        reverse_proxy localhost:14801
    }
    handle {
        reverse_proxy localhost:3000
    }
}

That bypasses single-port-server.js entirely. Caddy survives ECONNRESET gracefully and is already in the request path so it adds zero new failure surface.

Environment

  • @aoagents/ao v0.9.4 (installed via npm i -g @aoagents/ao)
  • Node.js v25.9.0
  • macOS Apple Silicon
  • Cloudflare Tunnel (cloudflared) in front of Caddy in front of ao start

Related

Filing this separately from #2102 (agent.isProcessRunning indeterminate log noise) because the fix lives in a different file and the failure mode is much more visible.
