feat: wire MPS (Apple Silicon) device into fine-tuning wizard #34

## Context

Issue #26 tracks MPS support generally. The fine-tuning wizard (added in #33) defaults to `device_map="auto"` in both `trainer/qlora.py` and the inference cache in `app/api.py`. On Apple Silicon, the `bitsandbytes` 4-bit quantization that QLoRA relies on is not supported, so attempting to run QLoRA on MPS will either fall back to CPU or error out.

## What needs to happen

1. **Detect device at wizard start** — call `GET /api/gpu` on page mount in `FinetuneState` and store the backend (`cuda` / `mps` / `cpu`).

2. **Disable QLoRA on MPS** — if `backend == "mps"`, grey out the QLoRA technique button with tooltip: _"QLoRA requires CUDA. Switch to LoRA for Apple Silicon."_ Auto-select LoRA instead.

3. **Set correct dtype in loader** — `trainer/loader.py` should pass `torch_dtype=torch.float16` and skip `BitsAndBytesConfig` when MPS is detected.

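4. **Inference endpoint** — `app/api.py::infer` hardcodes `torch_dtype=torch.float16` and `device_map="auto"`. On MPS the model should be loaded without `device_map="auto"` and moved with `.to("mps")` instead, since automatic device mapping is built primarily around CUDA placement and should not be relied on for MPS.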
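
The four steps above can be sketched as one backend probe plus a pure decision function. This is a minimal sketch and not the repo's actual code: `detect_backend`, `LoadPlan`, and `plan_load` are hypothetical names, and the CPU branch (LoRA in `float32`) is an assumption, since this issue only specifies the MPS behaviour.

```python
import importlib.util
from dataclasses import dataclass
from typing import Optional


def detect_backend() -> str:
    """Return "cuda", "mps", or "cpu" based on what PyTorch reports."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # assumption: a missing torch install is treated as CPU-only
    import torch

    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on old torch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


@dataclass(frozen=True)
class LoadPlan:
    """Hypothetical container for the loader and inference settings."""
    technique: str             # "qlora" or "lora"
    dtype_name: str            # name of the torch dtype, e.g. "float16"
    use_bnb_4bit: bool         # whether to build a BitsAndBytesConfig
    device_map: Optional[str]  # "auto" on CUDA, None otherwise
    move_to: Optional[str]     # device for an explicit .to(...), if any


def plan_load(backend: str, requested: str) -> LoadPlan:
    """Map the detected backend and requested technique to load settings."""
    if backend == "cuda":
        # CUDA keeps the current behaviour: QLoRA allowed, device_map="auto".
        return LoadPlan(requested, "float16", requested == "qlora", "auto", None)
    if backend == "mps":
        # Steps 2-4: force LoRA, skip quantization, move the model explicitly.
        return LoadPlan("lora", "float16", False, None, "mps")
    if backend == "cpu":
        # Assumption, not stated in this issue: LoRA in float32 on CPU.
        return LoadPlan("lora", "float32", False, None, "cpu")
    raise ValueError(f"unknown backend: {backend!r}")
```

A caller in `trainer/loader.py` or `app/api.py` could resolve the dtype with `getattr(torch, plan.dtype_name)`, build a `BitsAndBytesConfig` only when `plan.use_bnb_4bit` is true, and call `model.to(plan.move_to)` when `plan.move_to` is set. Keeping the decision in a pure function means the wizard UI and the loader can share it, and it can be unit-tested without a GPU.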

## Acceptance criteria

- [ ] Wizard detects MPS and auto-selects LoRA on Apple Silicon machines
- [ ] QLoRA button is visually disabled with an explanatory tooltip on MPS
- [ ] Training runs without error on M1/M2/M3 Mac using LoRA (no quantization)
- [ ] Test chat inference works on MPS after training
