Enable Xeon support #25
Added XEON example configuration
Bumped llama-stack image version
Added deploying on Xeon
Updated test command for vLLM
Bumped dependencies versions
Fixed formatting
Do we need to add
The architecture is:
By exposing LlamaStack directly with an external route, we're creating a path that bypasses all F5 security controls (WAF, rate limiting, API spec enforcement) that this quickstart is designed to showcase.
The README currently has curl examples that assume this route exists (lines 187-193), but I believe those examples should be updated to use port-forwarding instead:
This allows developers to test LlamaStack without exposing it externally.
Is there a specific reason you are adding that route? Is it necessary to enable Xeon support?
Updates Verify section to use port-forwarding for llama-stack service testing
@ganeshmurthy Thanks for the feedback! I initially added the route-llamastack to test the llama-stack using the README steps. Based on your suggestion, I’ve removed it and updated the README to use port-forwarding for testing instead.
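For reference, the port-forwarding approach discussed above might look like the following sketch. The namespace and service name are hypothetical placeholders, not values from this PR, and port 8321 is an assumption (it is Llama Stack's default server port); check the chart's Service definition for the real values.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration only; substitute the names used in your cluster.
NAMESPACE="my-namespace"   # assumed namespace
SERVICE="llamastack"       # assumed Service name
PORT="8321"                # assumed port (Llama Stack's default)

# Print the kubectl command that forwards the Service to localhost.
port_forward_cmd() {
  printf 'kubectl port-forward -n %s svc/%s %s:%s\n' \
    "$NAMESPACE" "$SERVICE" "$PORT" "$PORT"
}

# Print the local base URL that is reachable while the forward is running.
local_url() {
  printf 'http://localhost:%s\n' "$PORT"
}

port_forward_cmd
local_url
```

Run the printed `kubectl port-forward` command in one terminal, then query the service from another, for example with `curl "$(local_url)/v1/models"` (the `/v1/models` path is assumed from the Llama Stack API, not taken from this README). Nothing is exposed outside the cluster, so the F5 controls remain the only external path.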
Thank you for removing the route-llamastack To
Only the update to llm-service version 0.5.10 is required, as it includes the 3.4.0 image with Xeon support. I can revert the other components if needed.
Can you please only update the
Reverted the dependency version updates for pgvector and llama-stack
Sure, I've made the updates.
Sorry, this is my last comment.
This line was added to the README:
But no corresponding Xeon configuration example was added to
The file only includes:
Should we either:
Users following the
Added llama-3-1-8b-instruct example for Xeon
Thanks for the feedback! I added a llama-3-1-8b-instruct Xeon example to the values file.
README improvements
Helm configuration updates