Enable Xeon support by jgespino · Pull Request #25 · rh-ai-quickstart/f5-api-security

jgespino · 2026-06-16T18:41:27Z

README improvements

Added Xeon-based local deployment option (Option C) with clear example config
Replaced curl example with port-forwarding

Helm configuration updates

Updated dependency versions in Chart.yaml (llm-service)
Updated rag-values.yaml.example with Xeon model config comments
Updated llama-stack image tag (0.2.23 → 0.6.1)

ganeshmurthy · 2026-06-25T20:41:05Z

Do we need to add the LlamaStack route in deploy/helm/rag/templates/route-llamastack.yaml?

The architecture is:
User → Streamlit (exposed via route) → F5 XC Security → LlamaStack (internal only)

By exposing LlamaStack directly with an external route, we're creating a path that bypasses all F5 security controls (WAF, rate limiting, API spec enforcement) that this quickstart is designed to showcase.

The README currently has curl examples that assume this route exists (lines 187-193), but I believe those examples should be updated to use port-forwarding instead:

  # For local testing
  oc port-forward svc/llamastack 8321:8321
  curl -sS http://localhost:8321/v1/models

This allows developers to test LlamaStack without exposing it externally.
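
A minimal sketch of the port-forward check suggested above, wrapped so the tunnel is always torn down afterwards. The service name `llamastack` and port 8321 come from the commands above; the function names, the 2-second wait, and the use of `curl --fail` are illustrative assumptions, not part of the quickstart.

```shell
#!/bin/sh
# Sketch only: test LlamaStack through a temporary port-forward
# instead of an external route.

# Build the local URL for the models endpoint on the forwarded port.
llamastack_models_url() {
  printf 'http://localhost:%s/v1/models\n' "${1:-8321}"
}

# Open the tunnel, query the endpoint once, then close the tunnel.
check_llamastack() {
  port="${1:-8321}"
  # Run the port-forward in the background and remember its PID.
  oc port-forward svc/llamastack "${port}:8321" >/dev/null 2>&1 &
  pf_pid=$!
  # Give the tunnel a moment to come up (the delay is an arbitrary choice).
  sleep 2
  status=0
  curl -sS --fail "$(llamastack_models_url "$port")" || status=$?
  # Stop the port-forward so it does not linger after the check.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}
```

Calling `check_llamastack` after logging in with `oc` and selecting the project would run the same two commands as above, with the exit status reflecting whether the request succeeded.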

Is there a specific reason you are adding that route? Is it necessary to enable Xeon support?

jgespino · 2026-06-29T16:55:07Z

@ganeshmurthy Thanks for the feedback! I initially added route-llamastack.yaml so I could test llama-stack by following the README steps. Based on your suggestion, I’ve removed it and updated the README to use port-forwarding for testing instead.

ganeshmurthy · 2026-06-29T17:22:46Z

Thank you for removing route-llamastack.

To enable Xeon support, is it necessary to change these versions?

dependencies:
  - name: pgvector
    version: 0.5.6
    repository: https://rh-ai-quickstart.github.io/ai-architecture-charts
  - name: llm-service
    version: 0.5.10
    repository: https://rh-ai-quickstart.github.io/ai-architecture-charts
  - name: llama-stack
    version: 0.8.7
    repository: https://rh-ai-quickstart.github.io/ai-architecture-charts

jgespino · 2026-06-29T17:29:55Z

Only the update to llm-service version 0.5.10 is required, as it includes the 3.4.0 image with Xeon support. I can revert the other components if needed.

ganeshmurthy · 2026-06-29T18:02:05Z

Can you please update only the llm-service version to 0.5.10 in this PR? Please raise a separate PR for the version changes to pgvector and llama-stack. This PR should contain only the changes necessary to enable Xeon support.
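
One way to confirm that only llm-service changed is to read the pinned versions straight from Chart.yaml. This is a sketch: `dep_version` is a hypothetical helper, and it assumes the dependency entries are laid out as in the snippet quoted earlier in this thread (`- name:` followed by `version:`).

```shell
#!/bin/sh
# dep_version CHART_YAML NAME
# Print the version pinned for the dependency called NAME.
# Prints nothing if NAME is not listed. Quoted versions are printed with their quotes.
dep_version() {
  awk -v want="$2" '
    # A list item such as "- name: llm-service" starts a new dependency entry.
    $1 == "-" && $2 == "name:" { current = $3 }
    # The first version line inside the wanted entry is the pinned version.
    $1 == "version:" && current == want { print $2; exit }
  ' "$1"
}
```

For example, `dep_version deploy/helm/rag/Chart.yaml llm-service` would be expected to print 0.5.10 once the other two bumps are reverted. The Chart.yaml path is an assumption based on the template path mentioned earlier in this thread.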

jgespino · 2026-06-29T18:12:50Z

Sure, I've made the updates.

ganeshmurthy · 2026-06-29T19:16:18Z

Sorry, this is my last comment.

The README lists XEON in the hardware column for the 8B model (llama-3-1-8b-instruct), but no corresponding Xeon configuration example was added to deploy/helm/rag-values.yaml.example.

The file only includes:

   # Example Xeon configurations:
   # llama-3-2-3b-instruct:
   #   id: meta-llama/Llama-3.2-3B-Instruct
   #   enabled: true
   #   device: "xeon"
   #   args:
   #   - --max-model-len
   #   - "14336"
   #   - --max-num-seqs
   #   - "32"

Should we either:

Add a llama-3-1-8b-instruct Xeon example to the values file, or
Remove XEON from the 8B model's hardware column in the README if it's not validated/recommended?

Users following the README will expect to find configuration examples for all advertised hardware options.
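
The gap described above could be caught with a small check that looks for a `device: "xeon"` entry under a given model key, including commented-out examples. This is a sketch that assumes the example file keeps the layout quoted above; `has_xeon_example` is a hypothetical helper, not something in the repository.

```shell
#!/bin/sh
# has_xeon_example VALUES_FILE MODEL_KEY
# Succeed if the block under MODEL_KEY contains device: "xeon".
has_xeon_example() {
  awk -v key="$2" '
    {
      line = $0
      # Count a leading "#" as indentation so commented-out examples are scanned too.
      if (line ~ /^[ \t]*#/) sub(/#/, " ", line)
      rest = line
      sub(/^[ \t]+/, "", rest)
      indent = length(line) - length(rest)
      trimmed = rest
      sub(/[ \t]+$/, "", trimmed)
      if (trimmed == "") next
      # Entering the block for the requested model key.
      if (trimmed == key ":") { in_block = 1; key_indent = indent; next }
      # A line at the same or a shallower indent ends the block.
      if (in_block && indent <= key_indent) in_block = 0
      if (in_block && trimmed ~ /^device:[ \t]*"?xeon"?$/) { found = 1; exit }
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}
```

Running it once per model that the README advertises for Xeon, for instance `has_xeon_example deploy/helm/rag-values.yaml.example llama-3-1-8b-instruct`, would flag any model that lacks an example.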

jgespino · 2026-06-29T20:01:17Z

Thanks for the feedback! I added a llama-3-1-8b-instruct Xeon example to the values file.

jgespino added 5 commits June 16, 2026 10:57

Update rag-values.yaml.example

5777091

Added XEON example configuration. Bumped llama-stack image version.

Update README.md

69b302b

Added deploying on Xeon. Updated test command for vLLM.

Update Chart.yaml

1f16a94

Bumped dependencies versions

Create route-llamastack.yaml

88d28be

Update README.md

eaad702

Fixed formatting

keklundrh requested a review from ganeshmurthy June 26, 2026 18:36

jgespino added 2 commits June 29, 2026 09:35

Delete deploy/helm/rag/templates/route-llamastack.yaml

8eb254d

Update README.md

8c659f9

Updates Verify section to use port-forwarding for llama-stack service testing

Fix README formatting

0eb11a5

Update Chart.yaml

6681ec9

Reverted the dependency version updates for pgvector and llama-stack

Update rag-values.yaml.example

bf96928

Added llama-3-1-8b-instruct example for Xeon

ganeshmurthy approved these changes Jun 29, 2026

ganeshmurthy merged commit 0cc13b4 into rh-ai-quickstart:main Jun 29, 2026

ganeshmurthy merged 10 commits into rh-ai-quickstart:main from jgespino:main