Streaming inference in deploy tab — replace polling with SSE #42

## Current behavior

The test chat in wizard step 6 (Deploy tab) sends a `POST /api/jobs/{id}/infer` request and waits for the full response before displaying any output. For larger models or long outputs, the user sees nothing until generation has finished.

## Proposed change

Add a **Server-Sent Events (SSE)** streaming endpoint and switch the test chat to it, replacing its use of the single-response endpoint:

```
GET /api/jobs/{id}/infer/stream?prompt=...
```

The server yields tokens as they are generated:

```
data: {"token": "Hello"}\n\n
data: {"token": ","}\n\n
data: {"token": " world"}\n\n
data: [DONE]\n\n
```
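A minimal sketch of how the server side could produce this framing. The helper name `sse_events` and the constant `DONE_SENTINEL` are hypothetical, not existing project code. The generator it returns could be handed to a `StreamingResponse` with `media_type="text/event-stream"`.

```python
import json
from typing import Iterable, Iterator

DONE_SENTINEL = "[DONE]"  # terminator event from the wire format above


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Frame each token as one SSE event, then emit the terminator."""
    for token in tokens:
        # json.dumps escapes quotes and newlines, so a token cannot
        # break the "data: ...\n\n" event framing.
        payload = json.dumps({"token": token})
        yield f"data: {payload}\n\n"
    yield f"data: {DONE_SENTINEL}\n\n"


if __name__ == "__main__":
    for event in sse_events(["Hello", ",", " world"]):
        print(repr(event))
```

JSON-encoding the token matters because a raw newline inside a `data:` line would otherwise be read by the client as the end of the event.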

The Reflex frontend consumes the stream and appends tokens to the chat bubble in real time.
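The consumption logic is small enough to sketch. In the browser it would run over a `ReadableStream`; the Python version below shows the same steps and could serve as a test client. The function name `iter_tokens` is hypothetical, and the input is assumed to be already split into lines.

```python
import json
from typing import Iterable, Iterator


def iter_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield token strings from SSE lines, stopping at the [DONE] event."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank event separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # terminator: stop reading, caller closes the connection
        yield json.loads(payload)["token"]


if __name__ == "__main__":
    sample = [
        'data: {"token": "Hello"}', "",
        'data: {"token": ","}', "",
        'data: {"token": " world"}', "",
        "data: [DONE]", "",
    ]
    print("".join(iter_tokens(sample)))
```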

## Implementation notes

- FastAPI supports SSE via `fastapi.responses.StreamingResponse` with `media_type="text/event-stream"`.
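- The `transformers` `TextIteratorStreamer` can be used to stream tokens from the model. `model.generate` blocks until generation finishes, so it has to run in a background thread while the request handler iterates over the streamer.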
- The Reflex frontend can read the stream with a `fetch` call and `ReadableStream`, then dispatch state updates via `rx.set_state` (API name to be verified against the Reflex version in use), or fall back to a polling shim if full SSE support is unavailable in that version.
- Keep the existing non-streaming `POST /api/jobs/{id}/infer` endpoint for backward compatibility.
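The threading arrangement behind the streamer note can be sketched with the standard library alone. This is an illustration of the producer-thread pattern, not the `transformers` implementation: `stream_from_thread` and `fake_generate` are hypothetical names, and `fake_generate` stands in for a `model.generate(..., streamer=streamer)` call running in a `threading.Thread`.

```python
import queue
import threading
from typing import Callable, Iterator

_END = object()  # sentinel marking the end of generation

Emit = Callable[[str], None]


def stream_from_thread(generate: Callable[[Emit], None]) -> Iterator[str]:
    """Run `generate` in a background thread and yield what it emits."""
    q: queue.Queue = queue.Queue()

    def worker() -> None:
        try:
            generate(q.put)  # generate calls q.put(text) for each piece
        finally:
            q.put(_END)  # always unblock the consumer, even on error

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()  # blocks until the producer emits something
        if item is _END:
            return
        yield item


def fake_generate(emit: Emit) -> None:
    """Stand-in for the blocking model call (assumption for this sketch)."""
    for piece in ["Hello", ",", " world"]:
        emit(piece)


if __name__ == "__main__":
    print("".join(stream_from_thread(fake_generate)))
```

With the real library, the streamer object would take the place of the queue and the iterator: it is passed to `model.generate` as `streamer=` and then iterated in the request handler. The `finally` clause in the sketch is the part worth carrying over, since a producer that fails without signalling the end would leave the SSE connection hanging.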

## Acceptance criteria

- [ ] Tokens appear incrementally in the chat bubble as they are generated
- [ ] SSE connection closes cleanly on `[DONE]`
- [ ] Existing non-streaming endpoint still passes its tests
- [ ] Works with both CPU and CUDA inference backends
