
ECA becomes unresponsive when tool output causes LLM context overflow #391

@snoopier

Description

When a tool (especially a shell command) produces very large output, ECA hits the LLM provider's token limit with:

LLM response status: 400 body: {"error":{"message":"prompt token count of 174837 exceeds the limit of 168000","code":"model_max_prompt_tokens_exceeded"}}

After this error, the ECA server and Emacs become unresponsive/frozen, requiring a restart.

Root Cause Analysis

The issue involves multiple contributing factors on both the server and client sides.

Server-side

  1. No proactive token budget check before API call: The server sends the full message history to the LLM provider without estimating token usage. Context overflow is only detected reactively (HTTP 400 response) and recovered via prune + auto-compact. While the recovery logic works, it doesn't prevent the initial expensive roundtrip and the cascade of effects on the client.

  2. Tool output truncation exists but may be insufficient: outputTruncation defaults to 2000 lines / 50 KB, but a single tool result at the maximum (50 KB ≈ 12,500 tokens) combined with a long conversation history can still exceed the context limit. The truncation also doesn't account for the cumulative token budget across all messages.

  3. Potential infinite recovery loop: If auto-compact succeeds but the subsequent resume prompt still triggers context overflow, auto-compacting? has already been reset to false, allowing another compact cycle. There is no counter or recovery-attempted? guard to prevent Compact → Resume → Overflow → Compact → … loops (analogous to the existing auto-continued? guard for truncated responses).
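The first and third server-side points could be addressed together. The following is a minimal sketch, not ECA's actual code: all names (`estimate_tokens`, `within_budget`, `MAX_COMPACT_ATTEMPTS`, the ~4 chars/token heuristic) are illustrative assumptions showing a proactive budget check plus a bounded compaction retry, analogous to the existing `auto-continued?` guard.

```python
# Hypothetical sketch: proactive token-budget check before the API call,
# plus a bounded retry counter so Compact -> Resume -> Overflow cannot loop forever.
# Limits and function names are illustrative, not ECA's real API.

MAX_PROMPT_TOKENS = 168_000   # provider limit from the error message above
MAX_COMPACT_ATTEMPTS = 2      # guard against infinite compact cycles

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text and code.
    return len(text) // 4

def within_budget(messages: list[str], reserve: int = 8_000) -> bool:
    # Keep headroom ("reserve") for the model's reply.
    total = sum(estimate_tokens(m) for m in messages)
    return total + reserve <= MAX_PROMPT_TOKENS

def send_with_recovery(messages, call_llm, compact):
    attempts = 0
    while True:
        if not within_budget(messages):
            if attempts >= MAX_COMPACT_ATTEMPTS:
                # Fail loudly instead of looping; the client stays responsive.
                raise RuntimeError("context overflow: compaction did not help")
            messages = compact(messages)
            attempts += 1
            continue
        return call_llm(messages)
```

The key difference from the current behavior is that the overflow is detected locally before the expensive HTTP roundtrip, and the compaction loop is bounded by a counter rather than relying on the single `auto-compacting?` flag.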

Client-side (Emacs)

  1. Synchronous JSON parsing of large messages: The process filter in eca-process.el parses incoming JSON messages synchronously with no size limit. Before the overflow error, the server sends toolCalled notifications containing the full tool output. Parsing a multi-MB JSON message blocks the Emacs main thread.

  2. font-lock-ensure refontifies entire buffer: After each text content insertion (eca-chat.el), font-lock-ensure runs over the entire chat buffer. In gfm-mode with markdown-fontify-code-blocks-natively, this becomes very expensive for large buffers.

  3. align-tables / beautify-tables scan entire buffer: On "finished" status, both functions are called with (point-min), scanning the entire chat buffer for markdown tables — expensive for long conversations.

  4. Expandable content has no size limit: Tool call outputs are stored and rendered in expandable overlays without any truncation. Large outputs bloat the buffer.

These client-side issues compound: large tool output → expensive JSON parse → expensive rendering → expensive fontification → expensive table alignment, all running synchronously on the Emacs main thread.
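Client-side fix 4 could reuse a cap like the server's `outputTruncation` defaults (2000 lines / 50 KB). A minimal sketch, with the same limits borrowed purely for illustration (`cap_tool_output` and the truncation marker are hypothetical, not part of `eca-chat.el`):

```python
# Hypothetical sketch: cap tool output before it is stored in an expandable
# overlay, so a single result cannot bloat the chat buffer.
MAX_LINES = 2000        # mirrors the server's outputTruncation line default
MAX_BYTES = 50 * 1024   # mirrors the server's outputTruncation size default

def cap_tool_output(text: str) -> str:
    # Truncate first by line count, then by byte size, and mark the cut
    # so the user knows the full output was elided.
    lines = text.splitlines()
    truncated = False
    if len(lines) > MAX_LINES:
        lines = lines[:MAX_LINES]
        truncated = True
    out = "\n".join(lines)
    data = out.encode("utf-8")
    if len(data) > MAX_BYTES:
        out = data[:MAX_BYTES].decode("utf-8", errors="ignore")
        truncated = True
    if truncated:
        out += "\n[output truncated]"
    return out
```

Capping at insertion time also shrinks the inputs to every downstream step in the chain above (JSON parse excepted), since fontification and table alignment then operate on a bounded buffer.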

Steps to Reproduce

  1. Start an ECA chat session with a model that has a moderate context window (e.g., 128K-168K tokens)
  2. Have a conversation long enough to consume a significant portion of the context
  3. Execute a tool/shell command that produces very large output (e.g., find / -name "*.log" or similar)
  4. The LLM provider returns HTTP 400 (token limit exceeded)
  5. Emacs becomes unresponsive
