Voice download fails permanently on first ECONNRESET — no retry, misleading error message #108

@fitz123

Problem

When the bot tries to download a voice file from api.telegram.org and the first TLS handshake fails with ECONNRESET (transient network blip), the user sees "Failed to transcribe voice message" and loses the audio. There is no retry.

Two things are wrong:

  1. downloadFile has zero retry logic — a single transient network failure kills the flow
  2. The user-facing error incorrectly blames transcription, when the real failure happened before whisper was ever invoked

Evidence

bot/src/voice.ts:24-31:

export async function downloadFile(url: string, destPath: string): Promise<void> {
  const resp = await fetch(url);
  if (!resp.ok) {
    throw new Error(`Download failed: HTTP ${resp.status}`);
  }
  const buffer = Buffer.from(await resp.arrayBuffer());
  await writeFile(destPath, buffer, { mode: 0o600 });
}

Single fetch(), no retry, no special handling for network-layer failures (which throw before resp.ok is reached).

Stderr log sample (redacted):

ERROR [telegram-bot] Voice transcription error for chat <redacted-chat-id>: [TypeError: fetch failed] {
  [cause]: Error: Client network socket disconnected before secure TLS connection was established
      at TLSSocket.onConnectEnd (node:internal/tls/wrap:1708:19)
      ...
    code: 'ECONNRESET',
    host: 'api.telegram.org',
    port: 443
  }
}

Frequency on an operator machine: 25 occurrences over 30 days (sporadic, ~0–3 per day). Users experience this as a random "voice recognition broke" — but nothing about voice/whisper actually failed.

Root Cause

Two root causes stacked:

  1. No retry on transient fetch failure. The cause chain is fetch failed → ECONNRESET during TLS setup: a network-layer fault, not an HTTP status. It's exactly the class of error that benefits from a small retry with backoff. downloadFile catches neither the TypeError nor its ECONNRESET cause.

  2. Error message lies about which layer failed. The handler that wraps this failure logs and reports "Voice transcription error" + "Failed to transcribe voice message", regardless of whether the failure was in downloadFile, convertToWav (ffmpeg), or transcribeAudio (whisper). The user can't tell the difference — they send the voice again, it might download fine this time, and the "bug" feels random.
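One way to make the failing layer visible is to tag each step's error with its stage before it reaches the shared handler. This is a rough sketch only: VoiceStage, VoiceStageError, runStage and userMessageFor are hypothetical names that do not exist in the repo, and the message wording (apart from the existing transcription text) is a placeholder.

```typescript
// Hypothetical names throughout; nothing here exists in bot/src/voice.ts today.
type VoiceStage = "download" | "convert" | "transcribe";

class VoiceStageError extends Error {
  readonly stage: VoiceStage;
  readonly original: unknown;

  constructor(stage: VoiceStage, original: unknown) {
    super(`Voice ${stage} stage failed`);
    this.name = "VoiceStageError";
    this.stage = stage;
    this.original = original; // keep the underlying error for logging
  }
}

// Runs one pipeline step and tags any failure with the stage it came from.
async function runStage<T>(stage: VoiceStage, op: () => Promise<T>): Promise<T> {
  try {
    return await op();
  } catch (err) {
    throw new VoiceStageError(stage, err);
  }
}

// Placeholder wording; only the "transcribe" text matches the current message.
const USER_MESSAGES: Record<VoiceStage, string> = {
  download: "Could not download the voice message. Please try again in a few seconds.",
  convert: "Could not convert the voice message audio.",
  transcribe: "Failed to transcribe voice message",
};

// Maps a caught error to the text shown to the user.
function userMessageFor(err: unknown): string {
  if (err instanceof VoiceStageError) return USER_MESSAGES[err.stage];
  return "Failed to process voice message";
}
```

The handler would then wrap each call, for example runStage("download", () => downloadFile(url, destPath)), and likewise for convertToWav and transcribeAudio (their argument lists are not shown in this issue, so they are left out here). The log line can include err.stage and err.original instead of always saying "Voice transcription error".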

Suggested Direction

  • Add bounded retry with backoff to downloadFile (e.g. 3 attempts, 500ms / 1s / 2s) scoped to network-layer errors — not HTTP errors, which indicate real problems (404, 401) that won't be fixed by retry.
  • Distinguish the user-facing message by failure stage: download failure vs conversion failure vs transcription failure. A download failure is a network issue the user can retry in a few seconds; a transcription failure is an audio-content or model problem that won't benefit from resending.
  • Consider whether other fetch() calls to api.telegram.org have the same vulnerability (photo/document download in the same module pattern).
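A minimal sketch of the first bullet, to make the scoping concrete. Assumptions: isTransientNetworkError and withRetry are hypothetical helpers, the set of retryable codes is a guess beyond the ECONNRESET seen in the log, and "500ms / 1s / 2s" is read here as up to three retries after the initial attempt. HTTP errors are thrown as a plain Error, so they are never retried.

```typescript
import { writeFile } from "node:fs/promises";

// Assumed list; only ECONNRESET is confirmed by the log above.
const RETRYABLE_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "EAI_AGAIN"]);

// True for network-layer failures: fetch rejects with a TypeError whose
// `cause` carries a socket/DNS error code. HTTP status errors never match.
export function isTransientNetworkError(err: unknown): boolean {
  if (!(err instanceof TypeError)) return false;
  const cause = (err as { cause?: { code?: unknown } }).cause;
  return typeof cause?.code === "string" && RETRYABLE_CODES.has(cause.code);
}

// Retries `op` while `shouldRetry` says so, sleeping delaysMs[i] before retry i.
export async function withRetry<T>(
  op: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  delaysMs: readonly number[] = [500, 1000, 2000],
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      // Give up when the delays are exhausted or the error is not transient.
      if (attempt >= delaysMs.length || !shouldRetry(err)) throw err;
      await sleep(delaysMs[attempt]);
    }
  }
}

export async function downloadFile(url: string, destPath: string): Promise<void> {
  const buffer = await withRetry(async () => {
    const resp = await fetch(url);
    if (!resp.ok) {
      // Plain Error: not classified as transient, so 404/401 fail immediately.
      throw new Error(`Download failed: HTTP ${resp.status}`);
    }
    return Buffer.from(await resp.arrayBuffer());
  }, isTransientNetworkError);
  await writeFile(destPath, buffer, { mode: 0o600 });
}
```

The sleep parameter is injectable so the backoff can be unit-tested without real timers. If other fetch() calls to api.telegram.org turn out to share the problem, they could reuse withRetry with the same predicate.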
