Add vLLM cache to TTS model #4

huwenjie333 · 2026-01-09T11:26:27Z

This PR adds the vLLM cache directory, /root/.cache/vllm, to the deployment's persistent storage, so the cache survives restarts. It reduces cold start time from 1.5 minutes to 1 minute.
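
For reviewers who want to confirm the persisted directory is actually being reused across restarts, here is a minimal sketch of a check that could run at container startup. It is not part of this PR's diff. The helper names `vllm_cache_dir` and `cache_is_warm` are hypothetical, and the use of the `VLLM_CACHE_ROOT` environment variable as an override is an assumption about the vLLM version in use; the default path is the one given above.

```python
import os
from pathlib import Path

# Path taken from the PR description.
DEFAULT_VLLM_CACHE = "/root/.cache/vllm"


def vllm_cache_dir() -> Path:
    """Return the cache directory (hypothetical helper).

    Assumption: VLLM_CACHE_ROOT, if set, overrides the default location.
    """
    return Path(os.environ.get("VLLM_CACHE_ROOT", DEFAULT_VLLM_CACHE))


def cache_is_warm(cache_dir: Path) -> bool:
    """Return True if the directory exists and holds at least one file."""
    if not cache_dir.is_dir():
        return False
    return any(p.is_file() for p in cache_dir.rglob("*"))


if __name__ == "__main__":
    d = vllm_cache_dir()
    state = "warm" if cache_is_warm(d) else "cold"
    print(f"vLLM cache at {d} is {state}")
```

If the first start after a deploy reports a cold cache and later restarts report a warm one, the persistent mount is doing its job.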

update

43aa1df

huwenjie333 requested a review from jqug January 9, 2026 11:26

huwenjie333 assigned PatrickCmd and unassigned PatrickCmd Jan 9, 2026

huwenjie333 requested a review from PatrickCmd January 9, 2026 11:26

comments

8ae1ac7
