Improve /talk handling and lazy Whisper load #25
💡 Codex Review
Here are some automated review suggestions for this pull request.
elif "multipart/form-data" in content_type or "application/x-www-form-urlencoded" in content_type:
    form = await http_request.form()
    audio_file = form.get("audio") if isinstance(form, dict) else None
    if audio_file is not None and not isinstance(audio_file, UploadFile):
        audio_file = None
Audio form uploads ignored in /talk
The new /talk handler now calls http_request.form() but only extracts audio when the returned object is a plain dict. Starlette returns FormData, so isinstance(form, dict) is false and audio_file remains None even when clients send multipart audio. As a result the audio branch never runs, STT is skipped, and audio requests are rejected with a 400 (“Provide either text prompt or valid audio”) despite valid uploads. Use the FormData mapping interface directly instead of the dict check.
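To illustrate the point above without depending on the framework, here is a minimal sketch. `FakeFormData` and `FakeUploadFile` are hypothetical stand-ins (not Starlette classes) that only mimic the relevant property: the form container implements the mapping interface but is not a `dict` subclass. `extract_audio_buggy` mirrors the check in the diff; `extract_audio_fixed` is one possible shape of the suggested fix, not necessarily what the PR ends up doing.

```python
from collections.abc import Mapping


class FakeUploadFile:
    """Hypothetical stand-in for an uploaded file object."""

    def __init__(self, filename: str, data: bytes):
        self.filename = filename
        self.data = data


class FakeFormData(Mapping):
    """Hypothetical stand-in: a Mapping that is NOT a dict subclass."""

    def __init__(self, items):
        self._items = dict(items)

    def __getitem__(self, key):
        return self._items[key]

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


def extract_audio_buggy(form):
    # Mirrors the reviewed code: the dict check rejects mapping-like forms.
    return form.get("audio") if isinstance(form, dict) else None


def extract_audio_fixed(form, upload_type=FakeUploadFile):
    # Use the mapping interface directly, then validate the value's type.
    if form is None:
        return None
    candidate = form.get("audio")
    return candidate if isinstance(candidate, upload_type) else None


form = FakeFormData({"audio": FakeUploadFile("clip.wav", b"RIFF")})
print(extract_audio_buggy(form))            # None: the dict check fails
print(extract_audio_fixed(form).filename)   # clip.wav
```

In the real handler, `upload_type` would be the `UploadFile` class the code already imports; the `None` guard also covers test doubles whose `form()` returns nothing.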
…odex/fix-json-parsing-issue-in-talk-endpoint-4dfpqv
…-/talk-handler Fix /talk audio form handling
…-/talk-handler-m0dgy7 Fix /talk audio form handling
…-/talk-handler-9l8w72 Handle LLM timeouts and HTTP errors
Summary
Testing
python -m compileall backend
python - <<'PY'
import asyncio
import types
import sys

# Stub out huggingface_hub so the backend imports without the real package.
fake_hf = types.ModuleType("huggingface_hub")

class DummyInferenceClient:
    def __init__(self, *args, **kwargs):
        pass

    def chat_completion(self, *args, **kwargs):
        class R:
            choices = [{"message": {"content": "stub reply"}}]
        return R()

fake_hf.InferenceClient = DummyInferenceClient
sys.modules.setdefault("huggingface_hub", fake_hf)

# Stub out whisper so no model is loaded during the smoke test.
fake_whisper = types.ModuleType("whisper")
fake_whisper.load_model = lambda *args, **kwargs: types.SimpleNamespace(transcribe=lambda path, language=None: {"text": "hello"})
sys.modules.setdefault("whisper", fake_whisper)

from backend import main

# Replace the LLM and TTS calls with deterministic fakes.
main.ask_llm = lambda prompt: f"echo {prompt}"
main.tts_elevenlabs = lambda text, voice_id=None: b"mp3bytes"

class FakeRequest:
    def __init__(self, payload: dict):
        self._payload = payload
        self.headers = {"content-type": "application/json"}

    async def json(self):
        return self._payload

    async def form(self):
        return None

# Exercise the JSON branch of /talk and print a summary of the response.
req = FakeRequest({"prompt": "hi there", "voice": "abc"})
response = asyncio.run(main.talk(req))
print(type(response).__name__, response.status_code, len(response.body))
PY