Triage tiered fallback: cloud → wait-and-retry → fallback-local → defer

## Summary

Replace triage's current binary "cloud OR local-only" routing with a tiered fallback chain: cloud first → on 429 / transient failure wait-and-retry → on persistent failure fall through to local → on local failure defer the turn (via `JobOutcome::Defer`).

## Problem

`agent/triage/evaluator.rs` currently picks one of `cloud` or `local` based on `resolve_provider`, runs the call, and surfaces any failure to the orchestrator. There's no graceful degradation:

- 429 from the cloud → triage fails → user-visible toast.
- Local model not loaded yet → triage fails the same way.
- Cloud is hard-down → no automatic fallback to local even though the local arm is fully wired.

Every other tiered system (HTTP retry, mailgun, S3) has this fallback chain. Triage doesn't because the routing logic predates the local-arm gating from #1073.

## Solution

In `triage/evaluator.rs::evaluate`:

1. **Cloud first** when `resolve_provider == cloud`.
2. **On 429**: cooperate with the cloud rate limiter (separate issue) — wait `Retry-After`, retry once. After the second 429, fall through.
3. **On transient cloud failure** (timeout, 5xx, connection): retry once with backoff, then fall through.
4. **Fallback-local**: acquire `LlmPermit` (already wired by #1073) and run the same triage prompt against the local model. Mark the resolution path in the response so callers can tell it was a fallback.
5. **On local failure**: return `JobOutcome::Defer` (separate issue) with a short wake-up so the next tick retries the whole chain. Today's "hard fail" is the wrong default for a transient blocker.

Path tracking matters: the orchestrator should know whether triage hit cloud, fell back to local, or deferred — so it can color-code degraded turns and surface the state in `/debug` views.

## Acceptance criteria

- [ ] **Tiered fallback** — cloud → retry → local → defer chain implemented in `triage/evaluator.rs`.
- [ ] **Cloud retry budget** — at most 2 attempts (initial + 1 retry) before fallthrough; `Retry-After` honored.
- [ ] **Resolution path is observable** — response carries which arm produced it (`cloud` / `cloud-after-retry` / `local-fallback` / `deferred`).
- [ ] **Tests** — happy cloud, 429 → retry → ok, 429 → retry → 429 → local fallback, cloud 5xx → local fallback, local fail → defer.
- [ ] **Diff coverage ≥ 80%** — implementing PR meets the changed-lines coverage gate (Vitest + cargo-llvm-cov, enforced by [`.github/workflows/coverage.yml`](../../.github/workflows/coverage.yml)).

## Related

- #1073 (PR #1252) — local LLM throttling lists this as a deferred follow-up.
- Depends on cloud LLM rate limiter (separate issue) for `Retry-After` cooperation and bucket awareness.
- Depends on `JobOutcome::Defer` (separate issue) for the final deferred state.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Triage tiered fallback: cloud → wait-and-retry → fallback-local → defer #1257

Summary

Problem

Solution

Acceptance criteria

Related

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Triage tiered fallback: cloud → wait-and-retry → fallback-local → defer #1257

Description

Summary

Problem

Solution

Acceptance criteria

Related

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions